SageMaker Variant (AWS_SAGEMAKER_VARIANT)
Attributes (parameters)
The following attributes are available for this monitor type:
Name | Description | Unit | Default Key Performance Indicator (KPI) |
---|---|---|---|
CPU Utilization (CPUUtilization) | The sum of each individual CPU core's utilization. The utilization of each core ranges from 0% to 100%. For example, if there are four CPUs, the CPUUtilization range is 0%–400%. For processing jobs, the value is the CPU utilization of the processing container on the instance. For endpoint variants, the value is the sum of the CPU utilization of the primary and supplementary containers on the instance. | % | No |
Disk Utilization (DiskUtilization) | The percentage of disk space used by the containers on an instance. The value range is 0%–100%. This metric is not supported for batch transform jobs. For endpoint variants, the value is the sum of the disk space utilization of the primary and supplementary containers on the instance. | % | No |
GPU Memory Utilization (GPUMemoryUtilization) | The percentage of GPU memory used by the containers on an instance. The value range is 0–100 and is multiplied by the number of GPUs. For example, if there are four GPUs, the GPUMemoryUtilization range is 0%–400%. For endpoint variants, the value is the sum of the GPU memory utilization of the primary and supplementary containers on the instance. | % | No |
GPU Utilization (GPUUtilization) | The percentage of GPU units that are used by the containers on an instance. The value range is 0–100 and is multiplied by the number of GPUs. For example, if there are four GPUs, the GPUUtilization range is 0%–400%. For endpoint variants, the value is the sum of the GPU utilization of the primary and supplementary containers on the instance. | % | No |
Memory Utilization (MemoryUtilization) | The percentage of memory that is used by the containers on an instance. This value range is 0%–100%. For endpoint variants, the value is the sum of the memory utilization of the primary and supplementary containers on the instance. | % | No |
Loaded Model Count (LoadedModelCount) | The number of models loaded in the containers of the multi-model endpoint. This metric is emitted per instance. The models that this metric tracks are not necessarily unique because a model might be loaded in multiple containers at the endpoint. | # | No |
Model Cache Hit (ModelCacheHit) | The number of InvokeEndpoint requests sent to the multi-model endpoint for which the model was already loaded. | # | No |
Model Loading Time (ModelLoadingTime) | The interval of time that it took to load the model through the container's LoadModel API call. | ms | No |
Model Downloading Time (ModelDownloadingTime) | The interval of time that it took to download the model from Amazon Simple Storage Service (Amazon S3). | ms | No |
Model Unloading Time (ModelUnloadingTime) | The interval of time that it took to unload the model through the container's UnloadModel API call. | ms | No |
Model Loading Wait Time (ModelLoadingWaitTime) | The interval of time that an invocation request has waited for the target model to be downloaded, loaded, or both in order to perform inference. | ms | No |
Model Setup Time (ModelSetupTime) | The time it takes to launch new compute resources for a serverless endpoint. The time can vary depending on the model size, how long it takes to download the model, and the start-up time of the container. | ms | No |
Overhead Latency (OverheadLatency) | The interval of time added to the time taken to respond to a client request by SageMaker overheads. This interval is measured from the time SageMaker receives the request until it returns a response to the client, minus the ModelLatency. Overhead latency can vary depending on multiple factors, including request and response payload sizes, request frequency, and authentication/authorization of the request. | ms | No |
Model Latency (ModelLatency) | The interval of time taken by a model to respond as viewed from SageMaker. This interval includes the local communication times taken to send the request and to fetch the response from the container of a model and the time taken to complete the inference in the container. | ms | No |
Invocations Per Instance (InvocationsPerInstance) | The number of invocations sent to a model, normalized by InstanceCount in each ProductionVariant. 1/numberOfInstances is sent as the value on each request, where numberOfInstances is the number of active instances for the ProductionVariant behind the endpoint at the time of the request. | # | No |
Invocations (Invocations) | The number of InvokeEndpoint requests sent to a model endpoint. | # | No |
Invocation5XXErrors (Invocation5XXErrors) | The number of InvokeEndpoint requests where the model returned a 5xx HTTP response code. For each 5xx response, 1 is sent; otherwise, 0 is sent. | # | No |
Invocation4XXErrors (Invocation4XXErrors) | The number of InvokeEndpoint requests where the model returned a 4xx HTTP response code. For each 4xx response, 1 is sent; otherwise, 0 is sent. | # | No |
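As a sketch of how these attributes can be pulled directly from CloudWatch, the snippet below builds a `GetMetricStatistics` query for an endpoint variant. The endpoint and variant names are placeholders (assumptions, not values from this document). Note the namespace split: instance metrics such as CPUUtilization and MemoryUtilization are published under `/aws/sagemaker/Endpoints`, while invocation metrics such as Invocations and ModelLatency are published under `AWS/SageMaker`.

```python
import datetime

# Hypothetical endpoint/variant names -- replace with your own.
ENDPOINT_NAME = "my-endpoint"
VARIANT_NAME = "AllTraffic"


def build_metric_query(metric_name, namespace, period_seconds=300):
    """Build keyword arguments for CloudWatch get_metric_statistics.

    Instance metrics (CPUUtilization, MemoryUtilization, Disk/GPU metrics)
    live in the /aws/sagemaker/Endpoints namespace; invocation metrics
    (Invocations, ModelLatency, OverheadLatency, ...) live in AWS/SageMaker.
    """
    now = datetime.datetime.now(datetime.timezone.utc)
    return {
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "EndpointName", "Value": ENDPOINT_NAME},
            {"Name": "VariantName", "Value": VARIANT_NAME},
        ],
        "StartTime": now - datetime.timedelta(hours=1),
        "EndTime": now,
        "Period": period_seconds,
        "Statistics": ["Average"],
    }


if __name__ == "__main__":
    import boto3  # requires AWS credentials to actually run the query

    cloudwatch = boto3.client("cloudwatch")
    resp = cloudwatch.get_metric_statistics(
        **build_metric_query("CPUUtilization", "/aws/sagemaker/Endpoints")
    )
    for point in resp["Datapoints"]:
        print(point["Timestamp"], point["Average"])
```

Because CPUUtilization and the GPU metrics are summed across cores or devices, an `Average` datapoint can legitimately exceed 100% on multi-core or multi-GPU instances.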