The Database Server houses the TrueSight Capacity Optimization data warehouse (DWH). The primary purpose of the TrueSight Capacity Optimization DWH is to collect historical time series.
Near-Real-Time Warehousing is a process that stores, organizes and calculates statistics for the collected data. This service is always running inside the Data Hub.
Data collected by ETL tasks is not available to the analysis module before it is processed by the warehousing engine.
Data flow reports allow you to keep TrueSight Capacity Optimization imports under control, as the number of rows processed per day is a good indicator of the system's health.
When performing historical imports (for example, if you are planning to bulk load more than two million rows), it is strongly recommended that you split the data into smaller chunks to limit the amount of information processed at one time. This prevents congestion in the Near-Real-Time Warehouse engine.
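As an illustration of the chunking recommendation above, the following Python sketch splits a large set of rows into fixed-size batches before handing them to an import step. The names `chunked`, `import_in_chunks`, and `load_chunk`, and the default chunk size, are assumptions made for this example; they are not part of the TrueSight Capacity Optimization API.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(rows: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield successive lists containing at most chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def import_in_chunks(
    rows: Iterable[T],
    load_chunk: Callable[[List[T]], None],
    chunk_size: int = 100_000,  # assumed value; tune it for your environment
) -> int:
    """Pass rows to load_chunk one batch at a time; return the row count.

    load_chunk is a hypothetical callback standing in for whatever
    import step you actually run for each batch.
    """
    total = 0
    for chunk in chunked(rows, chunk_size):
        load_chunk(chunk)
        total += len(chunk)
    return total
```

Because `chunked` reads lazily from an iterator, the full historical data set never has to be held in memory at once.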
The data warehouse controls the following activities:
- Data aggregation
  - ETL tasks collect data into a stage table.
  - The warehousing engine calculates aggregations and splits data into summary tables at different time resolution levels (detail, hour, day, month).
  - Data is aggregated based on hierarchical rules (derived rows); this process is called hierarchical data aggregation.
- Data classification
  The day and hour class definitions are used to classify data as specified by the Calendar.
- Data aging
  Data classified as old by customizable aging parameters is deleted from the warehouse tables.
- Custom statistics
  Additional statistics that perform calculations on data series can be created (for example, percentiles or baselines).
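To make the idea of summary tables at different time resolutions more concrete, the following Python sketch groups detail samples into hour, day, or month buckets and computes a few statistics per bucket. It is a simplified illustration only: the function names and the choice of statistics (average, minimum, maximum, count) are assumptions, and the sketch does not reflect the actual warehousing engine or its table layout.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, Iterable, Tuple


def truncate(ts: datetime, resolution: str) -> datetime:
    """Return the start of the hour, day, or month that contains ts."""
    if resolution == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if resolution == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if resolution == "month":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported resolution: {resolution}")


def summarize(
    samples: Iterable[Tuple[datetime, float]], resolution: str
) -> Dict[datetime, Dict[str, float]]:
    """Aggregate (timestamp, value) samples into one summary row per bucket."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[truncate(ts, resolution)].append(value)
    # The statistics below are assumed for illustration only.
    return {
        start: {
            "avg": mean(values),
            "min": min(values),
            "max": max(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    }
```

Calling `summarize` on the same detail samples with `"hour"`, `"day"`, and `"month"` produces one set of summary rows per resolution level.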
The DWH information model
The TrueSight Capacity Optimization Data Warehouse houses the following types of data:
- Time series (TS): Performance or business driver data that represents a metric over time, with two subtypes:
  - Sampled, that is, metric samples taken at regular time intervals
  - Delta records
- Custom structures (CS): Data that represents generic records with custom attributes, with two subtypes:
  - Buffer tables - contain data that is copied into TrueSight Capacity Optimization for further processing, but is generally not important for direct analysis
  - Item-level detail tables - contain data that represents the details of an item that are important for analysis purposes, such as errors for a specific page
- Object relationships (OBJREL): Data that represents relationships between entities
- Events (EVDAT): Data that represents events
Time series and measured objects
A time series is a sequence of samples or statistics for a certain measurement, each corresponding to a point in time. The TrueSight Capacity Optimization DWH contains both time series samples and statistics aggregated at different time resolutions (hour, day, month). All time series are associated with a measured object, which is described according to a reference model. The following figure provides a basic illustration of the reference model for measured objects.
Reference model for measured objects
The model comprises the following components:
- An entity represents a single system (for example, a database instance) or a business driver, that is, the load that a given application undergoes (for example, the load of an FTP server). Refer to Working with domain entities for details.
- An object is a metric for a system resource or a business driver for which data is collected. For example, the CPU utilization percentage of a server or the FTP transfer bit rate.
- The location tracks the physical location from which a metric was observed. For example, the FTP transfer bit rate when a file is downloaded from New York or from Milan.
- A subobject represents finer details of a metric. For example, a metric measuring the free space of a disk could have details about the free space of each disk partition as its subobjects.
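The four components above can be thought of as the key that identifies a time series. The following Python sketch models that key as a small data structure and groups samples by it. All class, field, and function names, as well as the entity and metric names in the usage example, are hypothetical placeholders chosen for this example; they do not correspond to the DWH schema or to the standard metric names.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class MeasuredObject:
    """Identifies what a time series measures (hypothetical structure)."""
    entity: str                      # system or business driver
    metric: str                      # the "object" in the reference model
    location: Optional[str] = None   # where the metric was observed
    subobject: Optional[str] = None  # finer detail, such as a disk partition


@dataclass(frozen=True)
class Sample:
    measured: MeasuredObject
    timestamp: datetime
    value: float


def group_by_measured_object(
    samples: Iterable[Sample],
) -> Dict[MeasuredObject, List[Sample]]:
    """Build one time-ordered series per measured object."""
    series: Dict[MeasuredObject, List[Sample]] = defaultdict(list)
    for sample in samples:
        series[sample.measured].append(sample)
    for points in series.values():
        points.sort(key=lambda s: s.timestamp)
    return dict(series)
```

For example, two partitions of the same disk share an entity and a metric but differ in their subobject, so they produce two separate series:

```python
var = MeasuredObject("server01", "DISK_FREE", subobject="/var")    # placeholder names
home = MeasuredObject("server01", "DISK_FREE", subobject="/home")  # placeholder names
series = group_by_measured_object([
    Sample(var, datetime(2024, 1, 1, 10), 12.5),
    Sample(home, datetime(2024, 1, 1, 10), 40.0),
])
```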
Each available object/metric has a standard name which adheres to the naming convention defined in the datasets. For a complete listing, refer to TrueSight Capacity Optimization ETL Development Studio (also see Developing custom ETLs), or to the Viewing datasets and metrics by dataset and ETL module page in the Data Warehousing menu of the Administration section.
Each metric has a type that defines its unit of measure and how statistics about that data should be collected. The complete listing is as follows:
- Generic counter, absolute value
- A count of events, absolute number
- A frequency, in events/sec
- Positive accumulation counter
- Configuration data (string)
- Negative accumulation counter
- Elapsed time, in seconds
- Percentage counter, weighted
- Peak percentage counter
- Peak frequency, in events/sec
- Difference between subsequent samples
- Generic counter, absolute value, weighted
Measurement units and formats also use common standards:
from 0 to 1
How the DWH handles non-standard data sources
TrueSight Capacity Optimization uses ETL tasks to import data that is collected by third-party applications or logged by the monitored entities themselves. From a TrueSight Capacity Optimization perspective, this data is persistent because another application takes care of its collection and storage. In a second phase, TrueSight Capacity Optimization accesses this data and imports it into its own data warehouse.
Standard data imported by TrueSight Capacity Optimization is always in the form of a time series, which means that it is described by a value (numeric or textual, for configuration properties) which changes over time.
TrueSight Capacity Optimization uses the Data Mover components to handle data that does not have the two aforementioned properties, that is, data that is not persistent or is not in the form of a time series. The Data Mover can access data that is not a time series and store it in TrueSight Capacity Optimization.
The workflow of these components is displayed in the following image.