Collecting additional metrics using the Sysdig agent


You can use the Sysdig agent to collect additional metrics from your Linux and Windows virtual servers. These metrics are useful for gaining operational visibility into the performance and health of your applications, services, and platforms. The Sysdig agent collects these metrics and sends them to the Sysdig instance. When you run the IBM Cloud API ETL, these metrics are imported into the BMC Helix Continuous Optimization database.

Collecting Sysdig performance metrics from a Linux virtual server

  1. Log in to the virtual server by using your public IP address and root user name.
  2. Provision an instance of IBM Cloud Monitoring.

    1. One Sysdig service instance must be provisioned for each region. The user who creates the Sysdig instance must have "IBM Cloud Monitoring" privileges.

      Steps to manage user access
      1. Log in to the IBM Cloud console.

      2. In the IBM Cloud Console header, click Manage > Access (IAM).
      3. From the left navigation pane, select Users.
      4. In the Account users table, identify the user to whom you want to assign the access. From the Actions menu of that user, click Assign access.
      5. Select Assign access within a resource group.
      6. Select a resource group.
      7. If the user does not have a role already granted for the selected resource group, choose a role for the Assign access to a resource group field.
        Depending on the role that you select, the user can view the resource group on their dashboard, edit the resource group name, or manage user access to the group. Select No access if you want the user to have access only to IBM Cloud Monitoring in the resource group.
      8. Select IBM Cloud Monitoring.
      9. Select the platform role Administrator.
      10. Click Assign.
    2. Steps to provision an instance of the IBM Cloud Monitoring service

      To add monitoring features with IBM Cloud Monitoring in the IBM Cloud, you need to provision an instance of the IBM Cloud Monitoring service. You provision an instance within the context of a resource group. A resource group lets you organize your services for access control and billing purposes. You can provision the IBM Cloud Monitoring with Sysdig instance in the default resource group or in a custom resource group. When you provision an instance, you automatically get an ingestion key, known as the Sysdig access key.

      1. Log in to the IBM Cloud console.

      2. From the IBM Cloud dashboard, click the menu icon > Observability to access the Observability dashboard.
      3. Select Monitoring > Options > Create.
      4. Select the region.
      5. Select a service plan. By default, the Trial plan is set. For more information about the service plans, see Service plans.

      6. Enter a service name.
      7. Select a resource group. By default, the Default resource group is set.
      8. Turn on automatic collection of platform metrics by clicking Enable.
      9. Click Create to provision an instance.
        The service UI is displayed.

      To provision an instance of Sysdig by using the CLI, see Provisioning a Sysdig instance by using the CLI.

    3. Steps to configure a Sysdig agent

      To configure your Linux host (Ubuntu server) to send metrics to your IBM Cloud Monitoring instance, install a Sysdig agent.

      Complete the following steps from the command line:

      1. Open the terminal.
      2. Run the following command to log in to the IBM Cloud:

        ibmcloud login -a cloud.ibm.com

        Select the account where the IBM Cloud Monitoring instance is available.

      3. Obtain the Sysdig access key.
        1. Log in to the IBM Cloud console.

        2. From the left navigation pane, select Observability.
        3. Select Monitoring. The IBM Cloud Monitoring dashboard is displayed with a list of the monitoring instances that are available on IBM Cloud.
        4. Identify the instance for which you want to get the access key. Select Actions, and then click View Key. A pop-up window opens with the key information.
        5. Click the eye icon to view the access key.

          To obtain the access key by using the CLI, see Getting the access key by using the CLI.

      4. Obtain the IBM region list. For information, see Regions and endpoints.

      5. Obtain the region-specific public collector endpoint. For information, see Public collector endpoints.

      6. Run the following command to deploy the Sysdig agent on the virtual server:

        curl -s https://s3.amazonaws.com/download.draios.com/stable/install-agent | sudo bash -s -- --access_key <SYSDIG_ACCESS_KEY> --collector <COLLECTOR_ENDPOINT> --collector_port 6443 --secure true --check_certificate false --tags <TAG_DATA> --additional_conf 'sysdig_capture_enabled: false'

        Example (Frankfurt region):

        curl -s https://s3.amazonaws.com/download.draios.com/stable/install-agent | sudo bash -s -- --access_key 2cefff44-4cba-4c8d-afc0-a8563ee8049a --collector ingest.eu-de.monitoring.cloud.ibm.com --collector_port 6443 --secure true --check_certificate false --tags type:sysdig-agent,location:frankfurt,sourceType:virtualserver --additional_conf 'sysdig_capture_enabled: false'
      7. Verify the status of the dragent service. Run the following command:

        systemctl status dragent.service
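
      The install command in step 6 can be assembled from a region ID rather than typed by hand. The following is a minimal sketch, not an official script: it assumes that public collector endpoints follow the ingest.<region>.monitoring.cloud.ibm.com pattern seen in the Frankfurt (eu-de) example, and the function names collector_endpoint and build_install_command are hypothetical. Confirm the endpoint against Public collector endpoints before you use it.

```shell
#!/usr/bin/env bash
# ASSUMPTION: endpoints follow ingest.<region>.monitoring.cloud.ibm.com,
# as in the Frankfurt (eu-de) example. Verify for your region.
collector_endpoint() {
  local region="$1"
  printf 'ingest.%s.monitoring.cloud.ibm.com\n' "$region"
}

# Prints (does not execute) the agent install command so it can be reviewed.
# Arguments: access key, region ID, comma-separated tags.
build_install_command() {
  local access_key="$1" region="$2" tags="$3"
  printf '%s\n' "curl -s https://s3.amazonaws.com/download.draios.com/stable/install-agent | sudo bash -s -- --access_key ${access_key} --collector $(collector_endpoint "$region") --collector_port 6443 --secure true --check_certificate false --tags ${tags} --additional_conf 'sysdig_capture_enabled: false'"
}
```

      For example, build_install_command "<SYSDIG_ACCESS_KEY>" eu-de type:sysdig-agent,location:frankfurt prints the command for review. Printing instead of executing keeps the access key out of an accidental run against the wrong region.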

      After the installation is done, check the contents of the /opt/draios/etc/dragent.yaml file. The values of ssl, ssl_verify_certificate, and sysdig_capture_enabled properties must be set to the following:

      • ssl: true 
      • ssl_verify_certificate: false
      • sysdig_capture_enabled: false

      If these values are not correct in the dragent.yaml file, set these properties manually and save the file.
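
      The check of the three properties can be scripted. The following is a minimal sketch that assumes each property sits on its own line in the form key: value; the function name check_dragent_conf is hypothetical, and a line with a trailing comment is reported as missing.

```shell
#!/usr/bin/env bash
# Reports whether a dragent.yaml file contains the three expected settings.
# Returns 0 when all three are present, 1 otherwise.
check_dragent_conf() {
  local conf="$1" rc=0 expected
  for expected in 'ssl: true' 'ssl_verify_certificate: false' 'sysdig_capture_enabled: false'; do
    # Match the whole line, ignoring leading and trailing whitespace.
    if grep -Eq "^[[:space:]]*${expected}[[:space:]]*\$" "$conf"; then
      echo "OK       ${expected}"
    else
      echo "MISSING  ${expected}"
      rc=1
    fi
  done
  return "$rc"
}
```

      Run it as check_dragent_conf /opt/draios/etc/dragent.yaml. Any MISSING line points to a property to set manually.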

      (Optional) To filter the metrics, add the metrics_filter property in the dragent.yaml file. For details, see Including and excluding metrics.

      To view the metrics in the IBM Cloud Sysdig UI, launch the Sysdig Web UI. For details, see Launch the Web UI. In the Hosts and Containers section, you can find the entry for your Ubuntu server.

Collecting Sysdig performance metrics from a Windows virtual server

The Prometheus WMI exporter runs as a Windows service. You can configure the metrics that you want to monitor by enabling the collectors.

The following collectors are supported by IBM:

  • CPU
  • Computer system metrics (cs)
  • Disk metrics
  • Network interface metrics
  1. Configure the Prometheus WMI exporter
    1. Log in to your Windows computer.
    2. Download the Prometheus exporter.
      BMC Helix Continuous Optimization does not support v0.13.0 and later versions of the Prometheus exporter.

    3. Identify the collectors that contain the information for the metric data that you want to send to the Sysdig agent.
    4. Run the wmi_exporter and configure the collectors that you want to enable.
      .\wmi_exporter-0.12.0-amd64.exe --collectors.enabled <COLLECTORS>

      where <COLLECTORS> indicates the list of collectors that you want to configure.
      Example: To collect computer system metrics (cs), CPU metrics, disk metrics, and network interface I/O metrics, use the following command:

      .\wmi_exporter-0.12.0-amd64.exe --collectors.enabled "os,cpu,logical_disk,net,system"

    Note: The ETL does not support the latest version of the wmi exporter. Ensure that you download the 0.12.0 version (known as wmi_exporter and not the windows_exporter) of the exporter.

  2. (Optional) Configure the network settings
    1. Configure the Windows firewall to allow access to wmi_exporter-0.12.0-amd64.exe.
    2. (Optional) Update the VPC rules. If you use private endpoints, add an inbound rule to the security group for port 9182 with source type = Security Group and choose the security group for the Windows system.
  3. Collect metrics by running Prometheus as a client collector on Windows

    Use the Prometheus remote-write capabilities to push the metrics from the Windows system by running Prometheus as a client collector on Windows.

    1. Download the Prometheus monitoring system and time series database. Download the prometheus-2.15.2.windows-amd64.tar.gz file.

    2. Unzip the prometheus-2.15.2.windows-amd64.tar.gz file.
    3. Edit the prometheus.yml file in a text editor.
    4. Configure the scrape_configs section of the prometheus.yml configuration file as follows so that Prometheus scrapes the Windows wmi_exporter.

      scrape_configs:
        # The job name is added as a label `job=<job_name>` to any timeseries scraped from this configuration.
        - job_name: 'wmi_exporter'
          static_configs:
            - targets: ['localhost:9182']
              labels:
                region: us-east
                instance: <HOSTNAME>
                job: <JOBNAME>

      where,

      • <HOSTNAME> is the name of the Windows system
      • <JOBNAME> is a custom attribute that you can set to identify the role of the node that you are scraping and that you can also use to scope the data in Sysdig
    5. Add the remote_write configuration at the end of the prometheus.yml file to configure the target Sysdig instance that will receive the metrics.

      remote_write:
        - url: "ENDPOINT/api/prometheus/write"
          bearer_token_file: C:\Users\Administrator\prom\sysdig-apikey
          write_relabel_configs:
            # Drop forwarding the metrics generated by the exporter that are not supported
            - source_labels: ["__name__"]
              regex: "^wmi_(.*)"
              action: keep
            - regex: "(__name__)|(job)|(region)|(instance)|(status)|(core)|(name)|(start_mode)|(nic)|(volume)|(state)|(version)|(mode)|(branch)|(timezone)|(goversion)|(collector)|(revision)"
              action: labelkeep


      where,

      • ENDPOINT is the Sysdig collector endpoint. For the list of endpoints, see Sysdig Collector endpoints.

      • sysdig-apikey is the file that contains the Sysdig Monitor API Token. The file name does not have an extension.
        For information about how to get the API token, see Getting the Sysdig API token.
      Example: Completed version of the prometheus.yml file

        # my global config
        global:
          scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
          evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
          # scrape_timeout is set to the global default (10s).

        # Alertmanager configuration
        alerting:
          alertmanagers:
            - static_configs:
                - targets:
                  # - alertmanager:9093

        # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
        rule_files:
          # - "first_rules.yml"
          # - "second_rules.yml"

        # A scrape configuration containing exactly one endpoint to scrape:
        # Here it is the Windows wmi_exporter.
        scrape_configs:
          # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
          - job_name: 'wmi_exporter'
            static_configs:
              - targets: ['localhost:9182']
                labels:
                  instance: "my-windows-hostname"
                  region: "us-south"

        # Connection to sysdig
        remote_write:
          - url: "https://ingest.eu-gb.monitoring.cloud.ibm.com/api/prometheus/write"
            bearer_token_file: C:\Users\Administrator\prom\sysdig-apikey
            write_relabel_configs:
              - source_labels: ["__name__"]
                regex: "^wmi_(.*)"
                action: keep
              - regex: "(__name__)|(job)|(region)|(instance)|(status)|(core)|(name)|(start_mode)|(nic)|(volume)|(state)|(version)|(mode)|(branch)|(timezone)|(goversion)|(collector)|(revision)"
                action: labelkeep
    6. Start the Prometheus executable from the location containing the prometheus.yml file. Run .\prometheus.exe.
  4. To monitor Windows system metrics, use the default Windows Node Overview dashboard. This default dashboard is located in the Hosts and Containers section.
  5. (Optional) Verify the uptime for Windows with Prometheus Blackbox exporter. For details, see Verifying uptime for Windows with Prometheus Blackbox exporter.
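
The keep rule in write_relabel_configs forwards only the series whose metric name starts with wmi_. To preview which metric names would pass that rule, the exporter output can be filtered in the same way. The following is a minimal sketch, assuming a POSIX shell with curl is available (for example, Git Bash on the Windows system, or a Linux host that can reach port 9182); the function name wmi_metric_names is hypothetical.

```shell
#!/usr/bin/env bash
# Reads Prometheus text-format metrics on stdin and prints the distinct
# metric names that a "^wmi_(.*)" keep rule would forward.
wmi_metric_names() {
  # Drop comment lines (# HELP, # TYPE), cut each sample at the first
  # '{' or space to leave only the metric name, then keep wmi_ names.
  grep -v '^#' | sed -E 's/[{ ].*$//' | grep -E '^wmi_' | sort -u
}
```

For example, curl -s http://localhost:9182/metrics | wmi_metric_names lists the exporter metrics that are eligible for remote write. An empty result suggests that the exporter is not running or that no collectors are enabled.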

Metrics provided for Linux systems

Metrics provided for Windows systems

Troubleshooting Sysdig installation failure in Linux

If the Sysdig installation fails, perform the following tasks based on the type of error and your operating system:

Type of error: Kernel headers are not available

Troubleshooting steps: Install the kernel headers manually.

For Debian or Ubuntu Linux distributions

  1. Identify the distribution (cat /etc/os-release).
  2. Run the following command for the identified distribution:
    apt-get -y install linux-headers-$(uname -r)
  3. If the error still persists, run the following command:
    yum install kernel kernel-headers
  4. Deploy the Sysdig agent.

For RHEL, CentOS, and Fedora Linux distributions

  1. Identify the distribution (cat /etc/os-release).
  2. Run the following command for the identified distribution:
    yum -y install kernel-devel-$(uname -r)
  3. If the error still persists, run the following command:
    yum install kernel kernel-headers
  4. Deploy the Sysdig agent.

Type of error: sysdig-probe kernel module is not installed on the kernel

Troubleshooting steps:

  1. Install the kernel module by using the following command:
    yum install kernel kernel-headers
  2. Deploy the Sysdig agent.

Type of error: Installation fails because the dkms_autoinstaller service is stopped

Troubleshooting steps:

  1. Use the following commands to start the service:
    sudo yum -y install kernel-devel-$(uname -r)
    sudo /usr/lib/dkms/dkms_autoinstaller start
  2. Deploy the Sysdig agent.

Type of error: The kernel packages are not available

Troubleshooting steps:

  1. Run the following command to get the names of the packages that are not available. The package names are listed in the error that the command generates:
    yum -y install kernel-devel-$(uname -r)
  2. Download the package from https://rpmfind.net/linux/rpm2html/search.php?query=kernel-devel-x86_64 by using the wget command.
  3. Install the missing package:
    sudo yum localinstall <RPM file name>
  4. Install the kernel headers:
    sudo yum -y install kernel-devel-$(uname -r)
    yum install kernel kernel-headers

If the Sysdig agent installation still fails, check the logs at /opt/draios/etc/draios.log and raise a support case with the IBM Support team.
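
The first step for a kernel header error is to identify the distribution and pick the matching command. The following is a minimal sketch of that choice. It assumes that the ID field of the os-release file is enough to tell the distribution families apart; the function name kernel_headers_command is hypothetical, and the function only prints the command without running it.

```shell
#!/usr/bin/env bash
# Prints the kernel-header install command for the detected distribution.
# Reads the ID field from an os-release file (default: /etc/os-release).
kernel_headers_command() {
  local os_release="${1:-/etc/os-release}" id
  # ID is a standard os-release field, for example ID=ubuntu or ID="centos".
  id="$(sed -n 's/^ID=//p' "$os_release" | tr -d '"')"
  case "$id" in
    debian|ubuntu)      echo 'apt-get -y install linux-headers-$(uname -r)' ;;
    rhel|centos|fedora) echo 'yum -y install kernel-devel-$(uname -r)' ;;
    *)                  echo "unsupported distribution: ${id}" >&2; return 1 ;;
  esac
}
```

Calling kernel_headers_command with no argument reads /etc/os-release on the current host. Review the printed command before you run it with root privileges.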
