Discovering containers

A container is a package consisting of the application, its dependencies, libraries, and configuration. It is a method of virtualizing the OS layer and is often used to isolate an application into modular software pieces. Containers are not virtual machines (VMs). They are a lighter-weight solution than VMs as they share the kernel of the underlying OS. Containers run on container management software, such as the Docker Engine, and are commonly managed by using a container management (or orchestration) solution such as Kubernetes.

Discovery of software containers

Containers require additional techniques to fully discover them; the context in which a command is run on a host is crucial. For example, Java might be installed in the container but not on the host. A command that determines the installed Java version fails if it is run on the host; it must be run in the context of the container. If different versions of Java are installed on the host and in the container, the result of the command depends on the context in which it was run: the host or the container.

When a pattern triggers on a discovered process, it determines whether the process is running in a container or directly on a host. It uses that information to manage the context and choose the correct discovery commands, that is, whether to run each command on the host or in the container.
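The following sketch illustrates the idea of choosing a context per process. It is not how BMC Discovery implements the check: it assumes a simple heuristic (a 64-character hexadecimal container ID in the process's cgroup path, as seen from the host), and the function names are hypothetical.

```python
import re

# Container runtimes commonly embed a 64-character hexadecimal container ID
# in the cgroup path of a containerized process. Treating that as the marker
# is an assumption of this sketch, not a product-defined rule.
CONTAINER_ID = re.compile(r"[0-9a-f]{64}")


def looks_containerized(cgroup_text: str) -> bool:
    """Guess from the content of /proc/<pid>/cgroup whether a process is in a container."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-id:controllers:path".
        parts = line.split(":", 2)
        if len(parts) == 3 and CONTAINER_ID.search(parts[2]):
            return True
    return False


def command_context(cgroup_text: str) -> str:
    """Return where a discovery command should run: 'container' or 'host'."""
    return "container" if looks_containerized(cgroup_text) else "host"
```

On a host, the input would be the text of /proc/<pid>/cgroup for the discovered process. The heuristic can miss runtimes that use a different cgroup naming scheme.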

A scan of a container in the data center has two parts: SSH access and API access. SSH access discovers running processes, whether they are containers, running in containers, or one of the levels of orchestration or runtime processes. If the process is identified as a container, and it is simply running on a host without orchestration software, its discovery continues by using SSH. Patterns determine the version of the software running in the container.

If orchestration software, such as Kubernetes, is discovered, you can perform further discovery by using a separate API scan, which can determine which deployments are present, the pods in which they run, and the original images. This provides accurate versioning of the software running in the containers, which in turn makes lifecycle data such as the end-of-support and end-of-life data available for container-based software. In this case, SSH and API scans are required for complete discovery.
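As an illustration of what an API scan can derive, the sketch below groups container images by the controller that owns each pod. It assumes the input is a pod list in the shape returned by the Kubernetes API (metadata.ownerReferences, spec.containers[].image); the function names and the "unmanaged" label are hypothetical, and this is not the query BMC Discovery itself runs.

```python
from collections import defaultdict


def split_image(image: str) -> tuple[str, str]:
    """Split an image reference into (repository, tag or digest)."""
    if "@" in image:  # pinned by digest, for example repo@sha256:...
        repo, _, digest = image.partition("@")
        return repo, digest
    repo, sep, tag = image.rpartition(":")
    if sep and "/" not in tag:  # a colon followed by "/" is a registry port, not a tag
        return repo, tag
    return image, "latest"  # no tag given; runtimes default to "latest"


def images_by_owner(pod_list: dict) -> dict[str, set[tuple[str, str]]]:
    """Group (repository, tag) pairs by the controller that owns each pod."""
    result: dict[str, set[tuple[str, str]]] = defaultdict(set)
    for pod in pod_list.get("items", []):
        owners = pod["metadata"].get("ownerReferences", [])
        owner = f'{owners[0]["kind"]}/{owners[0]["name"]}' if owners else "unmanaged"
        for container in pod["spec"]["containers"]:
            result[owner].add(split_image(container["image"]))
    return dict(result)
```

Pods created by a Deployment are owned by a ReplicaSet, so mapping a pod to its Deployment needs one more lookup through the ReplicaSet's own ownerReferences, which this sketch omits.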

In this section, SSH means a login shell on the target. In practice, this would be SSH, but other kinds of shell access can be used.

With this information, BMC Discovery constructs an accurate model of the host, the containers running on that host, and the software running in those containers.

The discovery processes are the same whether a container is run and managed by orchestration or management software, and whether running on a host or in the cloud. The model constructed depends on how the container runs.

The following screenshot shows a visualization of discovered containers.

Discovery of software containers in the cloud

A scan of a container in the cloud, like a container in a data center, requires SSH access and API access. A BMC Discovery scan of a cloud uses the cloud API for discovery, and for some providers can automatically scan Kubernetes clusters.

Automatic discovery of cloud-based Kubernetes clusters occurs by default when you scan your supported cloud services. When BMC Discovery finds a Kubernetes cluster, it creates an automatic scan by using a Kubernetes token obtained from the cloud provider. Automatic scanning of Kubernetes clusters can be disabled for each scan. No additional credentials are required; the API token is generated depending on your existing privileges.

The Cluster URL must be accessible to BMC Discovery. In the cloud provider's management software, this might be referred to as enabling the public API.

Automatic scanning of Kubernetes clusters is supported in the following cloud vendors with no additional configuration:

Automatic scanning of Kubernetes clusters is supported in the following cloud vendors with additional (RBAC) configuration:

Automatic scanning of Kubernetes clusters is not supported in OpenStack.

Amazon Web Services

In a cloud scan of Amazon Web Services (AWS), BMC Discovery automatically scans Kubernetes clusters. For example, for an EKS cluster with worker nodes, an AWS scan finds the EKS cluster and scans the EKS API. The scan finds the worker nodes, and BMC Discovery then uses SSM to discover them. Permission escalation is available using sudo, and nerdctl is available by default to access containerd.

The only additional setup required is to authorize EKS API scans.

Google Cloud Platform

In a scan of Google Cloud Platform (GCP), BMC Discovery automatically scans Kubernetes clusters. If IAP is enabled, it uses the auto-provisioned SSH key to discover the worker nodes.

The only additional setup required is to configure sudo on the worker nodes.

Oracle Cloud Infrastructure

In a scan of Oracle Cloud Infrastructure (OCI), BMC Discovery automatically scans Kubernetes clusters with no additional setup.

If you have configured a bastion, you can discover worker nodes. You must configure sudo on the worker nodes.

IBM Cloud

In a scan of IBM Cloud, BMC Discovery can only use API scans to discover Kubernetes. It cannot access worker nodes.

Azure

In a scan of Azure, BMC Discovery can only use API scans to discover Kubernetes. It cannot access worker nodes.

Restrictions

BMC Discovery does not currently discover network connections to and from containers.

BMC Discovery does not discover containers running on Windows.

Modeling software containers

Software containers are modeled by using two kinds of Software Instance (SI): a Contained SI and a Stable SI.

Contained SI node

A Contained SI represents known software that is running in a container, for example, an Nginx server. Contained SIs show where your software is running, and they can be used to model impacts in your network.

Contained SIs are stored in the Default partition. When a Contained SI is created, representing, for example, a database server, additional patterns trigger and create nodes representing components which, in the context of containers, can be regarded as ephemeral, such as tables and schemas. The nodes representing the additional components are all stored in the Contained partition. The Contained partition is not included in standard searches, but you can select it for searches.

Contained SI aging

Contained SIs and their child nodes are removed when their container is removed.

Stable SI

A Stable SI can be regarded as information about why the software is present; it represents the deployment. For example, where Nginx is running in a pod that is managed by Kubernetes, it represents the template used to deploy the particular configuration of the Nginx software that runs in the pods. It is not linked to the host on which the pods are running; it is linked to the deployment.

To make sure that the model takes into account the content of the containers, all child nodes of the Contained SI are copied to the Default partition under the Stable SI. This means that the model reflects the content of the containers, without the churn that occurs in the normal operation of containerized software.

Stable SIs show the number of instances of software running, which can be used, for example, for licensing queries and vulnerability assessments.

Stable SI aging

A deployment managed by Kubernetes is modeled as a Stable SI. The Stable SI is not aged; it is removed only when the Kubernetes resource is removed, even if all of the containers that Kubernetes is managing have been removed. This approach caters to the following situations:

A deployment has been scaled to zero, for example, for maintenance, and is likely to be scaled up again. The Stable SI is retained until the Kubernetes resource (Deployment, DaemonSet, StatefulSet) is deleted.
While scanning a large cluster, the timing of the removal and redeployment of containers means that the discovery scan misses them.

Where containers are running on a host, the Stable SI is linked to the host node. When the Contained SIs are removed, the Stable SI remains and is aged out as a 'normal' SI.
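The two aging rules can be summarized in a short sketch. The class, its field names, and the aging_limit_days parameter are hypothetical; in particular, a simple day count stands in for whatever rules are applied to a 'normal' SI.

```python
from dataclasses import dataclass


@dataclass
class StableSIState:
    name: str
    kubernetes_managed: bool
    # True while the Deployment, DaemonSet, or StatefulSet still exists.
    kubernetes_resource_exists: bool = True
    days_since_last_seen: int = 0


def should_remove(si: StableSIState, aging_limit_days: int) -> bool:
    """Decide whether a Stable SI should be removed from the model."""
    if si.kubernetes_managed:
        # Not aged: removal follows the Kubernetes resource only,
        # even if no containers are currently running.
        return not si.kubernetes_resource_exists
    # Linked to a host: aged out like any other SI (simplified to a day count).
    return si.days_since_last_seen > aging_limit_days
```

A deployment scaled to zero therefore keeps its Stable SI indefinitely, while a standalone container's Stable SI disappears some time after the container is no longer seen.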

Example using Kubernetes

In this example, a single instance of Nginx is managed by Kubernetes. The Kubernetes Pod containing Nginx is deployed to three hosts. The instances in Kubernetes and Nginx running in the pods are modeled as follows:

Each Nginx server running in the Kubernetes pods on a worker node is modeled as a Contained SI. They are replicas of the deployment managed by Kubernetes.
The Nginx deployment managed by Kubernetes is modeled as a Stable SI.
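The relationship in this example can be sketched as a small data model. The class and field names are hypothetical and greatly simplified compared with the real node kinds; the point is only that one Stable SI is linked to the deployment, while each replica is a Contained SI linked to a worker node.

```python
from dataclasses import dataclass, field


@dataclass
class ContainedSI:
    software: str
    host: str  # worker node on which the pod runs


@dataclass
class StableSI:
    software: str
    deployment: str  # linked to the deployment, not to a host
    replicas: list[ContainedSI] = field(default_factory=list)

    @property
    def instance_count(self) -> int:
        """Number of running instances, as used for licensing-style queries."""
        return len(self.replicas)


def model_deployment(software: str, deployment: str, hosts: list[str]) -> StableSI:
    """Build one Stable SI with one Contained SI per host running a replica."""
    stable = StableSI(software, deployment)
    stable.replicas = [ContainedSI(software, host) for host in hosts]
    return stable
```

For the example above, model_deployment("Nginx", "nginx-deployment", ["worker-1", "worker-2", "worker-3"]) yields one Stable SI with three Contained SIs; the deployment and host names are placeholders.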

Example with a container running on a host

In this example, Nginx is running in a container that is running on a host. It was installed by a user running commands on the host.

The Nginx server running in the container is modeled as a Contained SI.
There is no Nginx deployment because the container was installed rather than deployed. Instead, the Stable SI is linked to the host on which the container runs.

Model diagrams

The following diagram shows the model for containers running in Kubernetes:

Container Model - Kubernetes.png

The following diagram shows the model for containers running on a host:

Container Model - Standalone.png

Permissions required for container discovery

To fully identify containers and the software running in them, BMC Discovery must be able to run several system tools with elevated permissions.

If a discovery run can tell that there is a container but cannot identify it, it creates a Discovery Condition on the unidentified node. Essentially, this is a note to highlight the unidentified container.

Container identification

BMC Discovery can identify containerized processes without additional tools, but other commands are required to get all the container details, such as image details. The commands used depend on the container technologies in use on each server.

Command	Notes
docker	Used when Docker manages containers on the host, or when Docker is the Kubernetes runtime (the default for old Kubernetes releases). To list all Docker containers, the user running the docker command must be in the docker group, or the command must be run with elevated privileges.
podman	Used when Podman manages containers on the host. Podman runs containers under multiple user accounts, so to list all Podman containers, BMC Discovery must run podman as the user that runs each container. The runuser command is used to run podman under the different user accounts.
nerdctl	Used when containerd is the Kubernetes runtime (the default for modern Kubernetes releases). The nerdctl command is an optional CLI tool for containerd. The command must run with elevated privileges. For more information, see: https://github.com/containerd/nerdctl
crictl	Used when cri-o is the OpenShift runtime (the default for OpenShift), or when containerd is the Kubernetes runtime (the default for modern Kubernetes releases). The crictl command is an optional CLI tool for CRI-compatible runtimes such as containerd and cri-o, but it is usually installed by default on OpenShift worker nodes. The command must run with elevated privileges. For more information, see: https://github.com/kubernetes-sigs/cri-tools
ctr	Used when containerd is the Kubernetes runtime (the default for modern Kubernetes releases). The ctr command is a simple debugging tool that is part of the containerd package and is installed by default on hosts that use containerd. The command must run with elevated privileges. However, ctr reports fewer details than other tools.
runc	Used when runc is the low-level container runtime. The command must run with elevated privileges. However, runc reports fewer details about containers than other tools and does not report any information about container images.
crun	Used when crun is the low-level container runtime. To list all crun containers, BMC Discovery must run the command for each user by using runuser. However, crun reports fewer details about containers than other tools and does not report any information about container images.
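The per-user listing described for podman can be sketched as follows. The exact arguments that BMC Discovery passes are not documented here, so the podman options shown (ps --all --format json) are an assumption, and the function name is hypothetical. The sketch only builds the command lines; it does not run them.

```python
def podman_list_commands(users: list[str]) -> list[list[str]]:
    """Build one 'podman ps' invocation per user account, run through runuser."""
    return [
        # runuser -u USER -- COMMAND runs COMMAND as USER (requires root).
        ["runuser", "-u", user, "--", "podman", "ps", "--all", "--format", "json"]
        # De-duplicate and sort so the result is stable for a given set of users.
        for user in sorted(set(users))
    ]
```

The list of users would typically come from the owners of the container processes already discovered on the host; how that list is obtained is outside this sketch.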

Software discovery

To discover the software running in each container, BMC Discovery needs to be able to execute commands and retrieve files from within the container context. BMC Discovery uses the nsenter command to enter container namespaces from the host.

Command	Notes
nsenter	The nsenter command requires elevated permissions to enter container namespaces.
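To make the host-versus-container context concrete, the sketch below builds the command line for running a discovery command either directly on the host or inside a container's namespaces through nsenter. Which namespaces BMC Discovery actually enters is not stated above, so the set used here (mount, UTS, IPC, network, PID) is an assumption, and the function name is hypothetical.

```python
from typing import Optional


def build_command(command: list[str], container_pid: Optional[int] = None) -> list[str]:
    """Return the argv to run `command` on the host or in a container's namespaces.

    container_pid is the host-side PID of a process in the target container;
    None means the command runs directly on the host.
    """
    if container_pid is None:
        return list(command)
    return [
        "nsenter",
        "--target", str(container_pid),  # take namespaces from this process
        "--mount", "--uts", "--ipc", "--net", "--pid",
        "--",  # end of nsenter options; the rest is the command to run
        *command,
    ]
```

For the Java example earlier in this topic, build_command(["java", "-version"]) checks the host, while build_command(["java", "-version"], container_pid=4321) checks the container that owns the placeholder PID 4321.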

Elevating privileges for deep container discovery

There are two mechanisms for elevating privileges: sudo (and similar tools) or capabilities.

sudo

sudo allows a user account to run specific commands as the superuser (root) without granting that account full root access. The following sudo rule allows the discovery user to run all the commands potentially required for deep container discovery.

discovery ALL=(root) NOPASSWD: /bin/docker, /bin/podman, /usr/sbin/runuser, /usr/local/bin/nerdctl, /usr/local/bin/crictl, /usr/bin/ctr, /usr/sbin/runc, /bin/crun

The exact paths could vary depending on the configuration of the target host.
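Because the paths vary, one option is to generate the rule from the paths found on each target host. The sketch below is a hypothetical helper, not a BMC-provided tool: it resolves each command with shutil.which and formats a rule in the same shape as the example above.

```python
import shutil

# Commands taken from the sudo rule above.
COMMANDS = ["docker", "podman", "runuser", "nerdctl", "crictl", "ctr", "runc", "crun"]


def resolve_paths(commands=COMMANDS, which=shutil.which) -> dict[str, str]:
    """Map each command that is installed to its absolute path on this host."""
    return {cmd: path for cmd in commands if (path := which(cmd))}


def sudoers_rule(user: str, paths: list[str]) -> str:
    """Format a NOPASSWD rule that lets `user` run the given commands as root."""
    return f"{user} ALL=(root) NOPASSWD: " + ", ".join(paths)
```

For example, sudoers_rule("discovery", list(resolve_paths().values())) produces a rule for the commands present on the current host. shutil.which searches the PATH of the user running the script, so directories such as /usr/sbin might be missed for a non-root user. Validate any generated file with visudo -c -f before installing it.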

Capabilities

Linux capabilities provide an alternative privilege escalation mechanism to sudo. For example, to run nsenter successfully, the following capabilities are required:

CAP_SYS_CHROOT
CAP_SYS_PTRACE
CAP_SYS_ADMIN

To configure capabilities

These capabilities can be granted to the discovery user for nsenter by using the following steps:

Edit /etc/security/capabilities.conf and add:
cap_sys_ptrace,cap_sys_chroot,cap_sys_admin discovery none *
Edit /etc/pam.d/login and add:
auth required pam_cap.so
Edit /etc/pam.d/sshd and add:
auth required pam_cap.so
Add inheritable capabilities to nsenter (requires root permissions):
sudo setcap 'cap_sys_ptrace+ie cap_sys_chroot+ie cap_sys_admin+ie' /bin/nsenter

Repeat all of the steps on each host.
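After configuring a host, you might want to confirm that a new login session for the discovery user actually carries the three capabilities in its inheritable set. The sketch below is a hypothetical check, not part of BMC Discovery: it parses the CapInh line of /proc/<pid>/status and tests the bit numbers defined in the Linux kernel header linux/capability.h.

```python
# Capability bit numbers from <linux/capability.h>.
REQUIRED = {"CAP_SYS_CHROOT": 18, "CAP_SYS_PTRACE": 19, "CAP_SYS_ADMIN": 21}


def missing_capabilities(status_text: str) -> list[str]:
    """Return the required capabilities absent from the inheritable set.

    status_text is the content of /proc/<pid>/status; the CapInh line holds
    the inheritable capability set as a hexadecimal bit mask.
    """
    for line in status_text.splitlines():
        if line.startswith("CapInh:"):
            mask = int(line.split()[1], 16)
            return [name for name, bit in REQUIRED.items() if not mask & (1 << bit)]
    raise ValueError("no CapInh line found")
```

Run as the discovery user in a fresh login session, missing_capabilities(open("/proc/self/status").read()) returns an empty list when all three capabilities are inheritable.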