Results Post-Processing

Once a query has completed, resulting in one or more sets of nodes, those sets can be post-processed to summarise or otherwise modify the results. This is achieved using a PROCESSWITH clause. (For backwards compatibility, PROCESSWITH can also be written as two words, PROCESS WITH.)

Optional parameters

In the following, some function parameters are shown with a value, such as min=0. This indicates that the parameter is optional, and that the value shown is the default used if none is provided. The key=value syntax is not part of the search syntax: parameters must always be provided as plain values, or omitted to use the defaults.

Post-processing functions may be chained together in a comma-separated list, in which case they are applied in turn, each taking the output of the previous one. For example, the following query follows from all ssh processes to processes they are communicating with:

SEARCH DiscoveredProcess WHERE cmd HAS SUBWORD "ssh"
PROCESSWITH communicationForProcesses,
localToRemote,
processesForCommunication
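The chaining rule can be modelled as ordinary function composition. The following is a minimal Python sketch, not search syntax; process_with, drop_empty and upper_case are hypothetical stand-ins, used only to show that each function receives the output of the previous one.

```python
from functools import reduce


def process_with(results, *functions):
    """Apply each post-processing function in turn to the previous output."""
    return reduce(lambda current, fn: fn(current), functions, results)


# Hypothetical stand-ins for post-processing functions.
def drop_empty(rows):
    return [row for row in rows if row]


def upper_case(rows):
    return [row.upper() for row in rows]


# drop_empty runs first; upper_case then receives its output.
chained = process_with(["ssh", "", "sshd"], drop_empty, upper_case)
```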

The results of a search can also be refined immediately, by following PROCESSWITH with WHERE and SHOW clauses:

SEARCH some search...
PROCESSWITH
where @2 = "foo"
show bar, baz, @3

The syntax is identical to the existing PROCESSWITH feature, just with no PROCESSWITH functions listed.

@number refers to columns in the first search, as in current refine searches.

Named node sets and set operations

Sets resulting from searches and traversals can be given names, and then combined with set operations. For example, to find hosts running Oracle products in London:

SEARCH SoftwareInstance WHERE type HAS SUBWORD "oracle" TRAVERSE :::Host AS oracle_hosts
SEARCH Location WHERE name HAS SUBWORD "London" TRAVERSE :::Host AS london_hosts
SEARCH IN (oracle_hosts AND london_hosts) SHOW name, os

Valid operators for the sets are:

  • and: intersection of sets.
  • or: union of sets.
  • - (minus sign): negative intersection, that is, the members of the first set that are not in the second.
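The three operators behave like the familiar set operations. As an illustration only, here they are modelled with Python sets; the host names are made up, and reading the minus sign as set difference is an assumption based on the description above.

```python
# Hypothetical node sets, represented here by host names.
oracle_hosts = {"host-a", "host-b", "host-c"}
london_hosts = {"host-b", "host-c", "host-d"}

in_both = oracle_hosts & london_hosts            # and: intersection
in_either = oracle_hosts | london_hosts          # or: union
oracle_not_london = oracle_hosts - london_hosts  # -: in the first set only
```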

Data manipulation functions

The following functions manipulate data:

bucket(interval, min=0, max=0)

Separates data into 'buckets' representing ranges of the values present. It looks at the first attribute selected with the SHOW clause, which should be either a number or a date. It starts by dividing the data into buckets whose width is specified by interval. For example, with an interval of 10, the first bucket contains the values between 0 and 10, the next those between 10 and 20, and so on. The result then contains one row for each bucket, with two columns showing the bucket value and the number of input values in the bucket.

If provided, min and max specify the minimum and maximum number of buckets. If the number of buckets based on the interval is outside those boundaries, the interval is divided or multiplied by two until it fits. If used to group dates the interval is assumed to be in seconds. This function is mostly used to build charts.

SEARCH DiscoveryAccess
SHOW discovery_duration_sum
PROCESS WITH bucket(30,3,20)
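The interval adjustment can be sketched in Python as follows. This is a model of the description above, not the product's implementation, and it makes several assumptions: a limit of 0 means "no limit", only non-empty buckets are counted, each bucket is labelled by its lower bound, and the maximum is applied before the minimum.

```python
from collections import Counter


def bucket(values, interval, min_buckets=0, max_buckets=0):
    """Count values per bucket, adjusting the interval to fit the limits."""

    def count(width):
        # Label each bucket by its lower bound.
        return Counter((value // width) * width for value in values)

    counts = count(interval)
    # Too many buckets: double the interval until the count fits.
    while max_buckets and len(counts) > max_buckets:
        interval *= 2
        counts = count(interval)
    # Too few buckets: halve the interval until the count fits.
    # There can never be more buckets than distinct values.
    distinct = len(set(values))
    while min_buckets and len(counts) < min(min_buckets, distinct):
        interval /= 2
        counts = count(interval)
    return sorted(counts.items())
```

For example, bucket([1, 5, 12, 18, 25, 47], 10) gives four buckets; adding max_buckets=2 doubles the interval twice, to 40, before the data fits.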

unique(sort=0)

The unique function takes the rows of output from the search, and returns each unique row just once.

If the optional sort argument is set to 1, the result rows are sorted; if set to zero or not provided, the rows are output in the same order they appeared in the original results, with duplicates removed.
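The two behaviours can be modelled in a few lines of Python. This is a sketch of the described semantics, with result rows represented as tuples:

```python
def unique(rows, sort=0):
    """Return each distinct row once, optionally sorted."""
    # dict.fromkeys keeps the first occurrence of each row, in order.
    deduplicated = list(dict.fromkeys(rows))
    return sorted(deduplicated) if sort else deduplicated
```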

countUnique(show_node_count=1, show_total_count=1, sort=1, set_headings=None, ignore_none=1)

The countUnique function can be used, for example, to produce a summary of discovered processes:

SEARCH DiscoveredProcess SHOW cmd, args PROCESS WITH countUnique(0)

It converts the search results into a summary where each row contains a command, arguments pair and count of how many times that pair appears.

When a single list attribute is provided in the SHOW clause, countUnique counts the individual items in the list and shows two counts. The first count is the number of nodes in which the item appears; the second count is the total number of times the item appears, which will be a larger number if an item appears more than once in a list.

countUnique takes two optional boolean arguments to indicate which count columns appear.

If sort is 1, countUnique sorts the results by total; if it is set to zero, it keeps them in the order it found them in the original result set.

All boolean arguments default to 1, meaning both count columns appear and the results are sorted.

If set_headings is provided, it contains a list of strings to use as the headings in the result, overriding the default headings. The list must have the correct number of items, corresponding to the number of columns (2 or 3 depending on the show parameters).

By default, values that are None are ignored; if ignore_none is set to zero, None values are counted in the same way as other values.
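The difference between the two counts in the list case can be illustrated with a short Python model. It is a sketch only: count_unique_list is a hypothetical name, each node is represented by its list attribute, and sorting by total in descending order is an assumption.

```python
from collections import Counter


def count_unique_list(lists, ignore_none=1):
    """Return (item, node count, total count) rows, largest total first."""
    node_count = Counter()
    total_count = Counter()
    for items in lists:
        if ignore_none:
            items = [item for item in items if item is not None]
        total_count.update(items)      # every occurrence counts
        node_count.update(set(items))  # at most one per node
    rows = [(item, node_count[item], total_count[item]) for item in total_count]
    return sorted(rows, key=lambda row: -row[2])
```

With the input [["a", "b", "a"], ["a"], ["b", None]], the item "a" appears in two nodes but three times in total.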

displayHistory(start, end, attr_count)

The displayHistory function explodes a result set with history information.

SEARCH Host
SHOW name, name, ram, processor
PROCESS WITH displayHistory(parseTime("2009-01-01"), currentTime(), 1)

The three arguments are the start date, the end date and the number of attributes to leave as is. By default the end date is now and the attribute count is 1 so this invocation could be simplified as:

SEARCH Host
SHOW name, name, ram, processor
PROCESS WITH displayHistory(parseTime("2009-01-01"))

This particular example will return a result set with one line for every change in the history of the attributes name, ram and processor, detailing on each line the date it happened, the attribute name and the attribute value before and after the change.

Provenance functions

These functions analyze provenance information:

provenanceDetails(friendly_time=0)

provenanceDetails takes all of the attributes selected in the SHOW clause and finds the provenance information for them. For each attribute it shows a row in the output containing the source node label, the attribute name, the attribute value, the time the attribute was last confirmed, and the label of the evidence node.

If the optional friendly_time argument is set to 1, the times are converted to strings; if set to 0 or not provided, times are returned in the internal time format.

provenanceFailures(friendly_time=0)

provenanceFailures is currently only supported for Host nodes. For each Host, it traverses to find the most recent DiscoveryAccess. It then finds the provenance information for each of the attributes in the SHOW clause. The output consists of one row for each attribute that was not confirmed in the most recent DiscoveryAccess. The rows contain the label of the Host node, the attribute name, the attribute value, the label of the evidence node used to set the attribute, the time the value was confirmed, and the time of the most recent DiscoveryAccess.

If the optional friendly_time argument is set to 1, the times are converted to strings; if set to 0 or not provided, times are returned in the internal time format.

Network connection functions

These functions analyze network connection data:

communicationForProcesses(targets=3, show="SUMMARY")

Given an input of DiscoveredProcesses, returns a list of node sets which varies depending on the value of targets:

  • targets = 1: returns DiscoveredNetworkConnections
  • targets = 2: returns DiscoveredListeningPorts
  • targets = 3: returns both

Returns the network connections and listening ports that correspond to the given set of processes; that is, the network connection or listening port comes from the same discovery access as the process, and the process ids match.
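The matching rule amounts to a join on the pair (discovery access, process id). The Python sketch below models it with plain dictionaries; the "access" and "pid" keys are hypothetical names for illustration, and treating targets as a pair of bits is an assumption consistent with the values 1, 2 and 3 listed above.

```python
def communication_for_processes(processes, connections, ports, targets=3):
    """Return the node sets that match the processes, selected by targets."""
    keys = {(process["access"], process["pid"]) for process in processes}

    def matching(nodes):
        # Same discovery access and same process id as one of the processes.
        return [node for node in nodes if (node["access"], node["pid"]) in keys]

    node_sets = []
    if targets & 1:
        node_sets.append(matching(connections))
    if targets & 2:
        node_sets.append(matching(ports))
    return node_sets
```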

The show clause determines which attributes on the nodes are returned in tabular results. The same show clause is used for both node kinds.

The result of this function is useful for feeding in to the localToRemote or communicationToRemoteHost functions.

processesForCommunication(show="SUMMARY")

The input must contain DiscoveredNetworkConnections or DiscoveredListeningPorts or both. Returns a node set of DiscoveredProcesses.

Returns the processes that correspond to the given network connections and listening ports; that is, the processes come from the same discovery access as the connections and ports, and the process ids match.

The show clause determines which attributes on the nodes are returned in tabular results.

localToRemote(targets=15, show="SUMMARY")

The input must contain DiscoveredNetworkConnections or DiscoveredListeningPorts, or both. Returns nodes corresponding to the 'other end' of the input communication information. The results depend upon the targets specification. targets is a 'bit mask' formed by adding together the numbers corresponding to the required results:

targets   result
1         remote DiscoveredNetworkConnections
2         remote DiscoveredListeningPorts
4         in-machine DiscoveredNetworkConnections
8         in-machine DiscoveredListeningPorts

The function distinguishes between 'remote' connections that are on different machines to the source information and 'in-machine' connections that are communication from one process to another on a single machine.
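To make the bit mask concrete, the following Python sketch decodes a targets value into the result kinds it selects. decode_targets is a hypothetical helper for illustration, not part of the search language.

```python
TARGET_BITS = {
    1: "remote DiscoveredNetworkConnections",
    2: "remote DiscoveredListeningPorts",
    4: "in-machine DiscoveredNetworkConnections",
    8: "in-machine DiscoveredListeningPorts",
}


def decode_targets(targets=15):
    """List the result kinds selected by a targets bit mask."""
    return [label for bit, label in TARGET_BITS.items() if targets & bit]


# Remote results only: add the values for the two remote kinds.
remote_only = 1 + 2
```

The default of 15 is 1 + 2 + 4 + 8, so all four kinds are returned unless targets is given.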

DiscoveredNetworkConnection nodes can be used to find both target DiscoveredNetworkConnection and DiscoveredListeningPort nodes.

DiscoveredListeningPort nodes can only be used to find target DiscoveredNetworkConnection nodes.

In all cases, only network connections and listening ports found during the most recent complete DiscoveryAccess will be considered, meaning that only 'current' data is used.

Note that it is not an error to, for example, pass in only DiscoveredListeningPorts and set targets to 2; you will simply get an empty list as a result.

The show clause determines which attributes on the nodes are returned in tabular results. The same show clause is used for both node kinds.

The result of this function is useful for feeding in to the processesForCommunication function.

hostToHostCommunication(show="SUMMARY")

Given a set of Host nodes, returns a set of Hosts that are communicating with those hosts, according to observed network connections.

A host is considered to be communicating with another if there is a network connection from either host to the other, according to the remote IP address on the network connection and the IP addresses of the host.
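The rule can be modelled as a symmetric check on IP addresses. In this Python sketch each host is a dictionary with hypothetical keys: "ips" holds the host's own addresses and "remote_ips" the remote addresses seen on its network connections. It illustrates the rule only and is not the product's implementation.

```python
def is_communicating(a, b):
    """True if either host has a connection to an address of the other."""
    return bool(a["remote_ips"] & b["ips"] or b["remote_ips"] & a["ips"])


def host_to_host_communication(selected, all_hosts):
    """Return the hosts that communicate with any of the selected hosts."""
    return [
        other
        for other in all_hosts
        if any(other is not host and is_communicating(host, other) for host in selected)
    ]
```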

The show clause determines which attributes on the nodes are returned in tabular results.

communicationToRemoteHost(show="SUMMARY")

Given a list of node sets, which must contain DiscoveredNetworkConnections or DiscoveredListeningPorts (or both), return a set of Host nodes that are communicating.

A Host is returned if one of its IPs matches one of the remote IP addresses of one of the network connections, or if it has a network connection with a remote IP address and port that matches one of the listening ports.

The show clause determines which attributes on the nodes are returned in tabular results.

hostToRemoteCommunication(show="SUMMARY")

Given a set of Host nodes, returns a list of DiscoveredNetworkConnections and DiscoveredListeningPorts that are communicating with the hosts. Network connections are returned if their remote IP address corresponds to an IP address of one of the hosts, and listening ports are returned if one of the hosts has a network connection whose remote IP address and port matches the listening port.

Only network connections and listening ports found during the most recent complete DiscoveryAccess will be considered, meaning that only 'current' data is used.

The show clause determines which attributes on the nodes are returned in tabular results.

The result of this function is useful for feeding in to the processesForCommunication function.

communicatingSIs(show="SUMMARY")


Given a set of SoftwareInstance nodes, returns a set of the SoftwareInstance nodes with which they are communicating.

This function is equivalent to a traversal from SoftwareInstance to DiscoveredProcess, then a chain of communicationForProcesses, localToRemote, processesForCommunication, followed by a traversal from DiscoveredProcess back to SoftwareInstance.

In BMC Atrium Discovery version 8.2.01 and later, communicatingSIs returns SoftwareInstance nodes corresponding to both remote and in-machine communication links. Earlier versions only returned SoftwareInstance nodes on remote machines.

connectionsToUnseen

Takes a set of DiscoveredNetworkConnection nodes and filters it to include only those that represent connections to addresses that have not been seen on NetworkInterface nodes present in the system. That is, it returns network connections to devices that have not been scanned.

networkConnectionInfo

Takes a set of DiscoveredNetworkConnection nodes and produces a summary of information about each one. Each row contains the name of the "local" Host, the associated local process command and arguments, port and IP address information for the connection, the remote command and arguments, and the remote host name. Not all of the information is always available, for example if the remote side of the connection had not been scanned, or if insufficient discovery permission meant ports were not associated with processes.

networkConnectionInfo takes four optional list parameters to modify the attributes shown for Hosts and DiscoveredProcesses. The first two parameters specify the attributes to show on Host nodes and the headings to use for those columns; the second two parameters specify the attributes to show on DiscoveredProcess nodes and the headings to use for those. The number of headings must match the number of attributes shown for each node kind. In both cases, the attributes and headings are used twice, once for the local end of the network connection, and once for the remote end. The headings are prefixed "Local" and "Remote" as appropriate.

The default settings are equivalent to networkConnectionInfo(["name"], ["host name"], ["cmd", "args"], ["command", "arguments"]). It is valid to use key expressions in the attribute lists, so attributes from nodes related to the hosts or processes can be displayed.
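The relationship between attributes and headings can be sketched in Python. prefixed_headings is a hypothetical helper; the exact text of the generated headings (here "Local host name" and so on) is an assumption, since the description says only that the headings are prefixed.

```python
def prefixed_headings(attributes, headings):
    """Build the local and remote headings for one node kind."""
    if len(attributes) != len(headings):
        raise ValueError("the number of headings must match the number of attributes")
    # The same headings are used twice: once per end of the connection.
    return [f"{side} {heading}" for side in ("Local", "Remote") for heading in headings]


host_headings = prefixed_headings(["name"], ["host name"])
process_headings = prefixed_headings(["cmd", "args"], ["command", "arguments"])
```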

siCommunicationSummary

Takes a set of SoftwareInstance nodes and produces a summary with one row for each observed network connection, treating the input SoftwareInstance nodes as the local end. Each row contains the local Host and SoftwareInstance, local and remote IP address and port, remote SoftwareInstance and remote Host. Values for the remote end may not be available if the remote host was not scanned, or if the connection could not be associated with a particular SoftwareInstance.

Like networkConnectionInfo, siCommunicationSummary takes four optional list parameters to modify the attributes shown for Hosts and SoftwareInstances. The default is equivalent to siCommunicationSummary(["name"], ["host name"], ["name"], ["SoftwareInstance"]). Key expressions are valid in the attribute lists, so attributes from nodes related to the Hosts or SoftwareInstances can be shown.

Chart-specific functions

These functions are only useful as input to charts:

timeSeries(column_count = 20, how_far_back = 0)

This function transforms a list of result sets with two dates into a result set of counts. An example usage is to graph the count of hosts over time. column_count determines how many data points are present in the output. how_far_back is a time: the function drops anything in the given set that is older than this date.

The function assumes that the first two columns in the passed set are dates and if another column is present returns one data set per possible value, somewhat similar to countUnique.

This function returns multiple result sets, one of which is a legend whose metadata dictionary contains an isLegend key. If there are more than two columns in the source sets, it returns one result set per unique value of the additional column:

  • The result set title is set to the string version of the value.
  • value in the metadata is the original CORBA any containing the split value.
  • index in the metadata preserves the order in which the split values were found in the source nodeset.

If there are only two columns, only two result sets are returned, the legend and a list of counts, and no metadata is set.

Example of use:

SEARCH FLAGS(include_destroyed) Host
   SHOW creationTime(#), destructionTime(#)
   PROCESS WITH timeSeries()
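As a rough model of the two-column case, the Python sketch below counts, at evenly spaced points in time, the rows that exist at each point. It rests on assumptions not stated above: dates are plain numbers, a row is counted when its first date is on or before the point and its second date is absent or later, and the points span the earliest to the latest date in the input. The how_far_back parameter and the legend are omitted.

```python
def time_series(rows, column_count=20):
    """Count the rows that exist at each of column_count points in time."""
    if not rows:
        return []
    starts = [first for first, _ in rows]
    ends = [second for _, second in rows if second is not None]
    start, end = min(starts), max(starts + ends)
    step = (end - start) / max(column_count - 1, 1)
    points = [start + index * step for index in range(column_count)]
    return [
        (
            point,
            sum(
                1
                for first, second in rows
                if first <= point and (second is None or second > point)
            ),
        )
        for point in points
    ]
```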