GPU Metrics In NVIDIA Summary Datamart


The NVIDIA Summary Datamart stores aggregated GPU metrics collected through the dcgm-exporter integration. The metrics are associated with the SYSGPU dataset and support monitoring, performance analysis, and capacity planning. They are imported from CSV files generated by the scripts described in Collecting GPU Data Using NVIDIA dcgm-exporter.


Metrics List

Metric                             Statistic
BYGPU_CORRECTABLE_REMAPPED_ROWS    SUM
BYGPU_GPU_DECODER_UTIL             PCT95
BYGPU_GPU_ENCODER_UTIL             PCT95
BYGPU_GPU_MAX_OP_TEMP              LAST_VALUE
BYGPU_GPU_TEMP                     MAX
BYGPU_GPU_UTIL                     PCT95
BYGPU_INDEX                        VALUE
BYGPU_MEM_CLOCK_MHZ                PCT95
BYGPU_MEM_COPY_UTIL                PCT95
BYGPU_MEM_FREE                     MIN
BYGPU_MEM_MAX_OP_TEMP              MAX
BYGPU_MEM_RESERVED                 PCT95
BYGPU_MEM_TEMP                     MAX
BYGPU_MEM_USED                     PCT95
BYGPU_MEM_UTIL                     PCT95
BYGPU_MODEL                        VALUE
BYGPU_NAME                         None
BYGPU_PCIE_RETRIES                 SUM
BYGPU_PWR_UTIL                     AVG
BYGPU_ROW_REMAP_FAILURE            AVG
BYGPU_SHUTDOWN_TEMP                LAST_VALUE
BYGPU_SLOWDOWN_TEMP                LAST_VALUE
BYGPU_SM_CLOCK_MHZ                 PCT95
BYGPU_TOTAL_NVLINK_BANDWIDTH       SUM
BYGPU_TOTAL_REAL_MEM               LAST_VALUE
BYGPU_UNCORRECTABLE_REMAPPED_ROWS  SUM
BYGPU_UUID                         VALUE
BYGPU_VGPU_LICENSE_STATUS          AVG
BYGPU_XID_ERROR                    AVG
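
The statistic paired with each metric names the aggregation applied to the raw samples. The following Python sketch illustrates what each numeric statistic would yield for a series of samples. It is not the product's implementation: the nearest-rank percentile method, the `summarize` helper, and the sample values are all assumptions made for illustration. The VALUE and None statistics are omitted because they appear on identifier-style metrics (index, model, name, UUID) rather than on numeric series.

```python
import math


def pct95(samples):
    """95th percentile using the nearest-rank method (an assumed method)."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Statistic names from the metrics list mapped to simple aggregation functions.
AGGREGATORS = {
    "SUM": sum,
    "AVG": lambda s: sum(s) / len(s),
    "MIN": min,
    "MAX": max,
    "PCT95": pct95,
    "LAST_VALUE": lambda s: s[-1],  # most recent sample in the series
}


def summarize(statistic, samples):
    """Hypothetical helper: apply the named statistic to a list of samples."""
    return AGGREGATORS[statistic](samples)


# Hypothetical GPU utilization samples (percent), oldest first: 5, 10, ..., 100
gpu_util = [5 * i for i in range(1, 21)]

print("PCT95:", summarize("PCT95", gpu_util))
print("MAX:", summarize("MAX", gpu_util))
print("AVG:", summarize("AVG", gpu_util))
```

With these samples, PCT95 discards the single highest reading while MAX keeps it, which is why utilization-style metrics summarized with PCT95 are less sensitive to short spikes than temperature metrics summarized with MAX.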

 


BMC Helix Continuous Optimization 25.4