Amazon Web Services - Cost and Usage Extractor

Use the Amazon Web Services - Cost and Usage Extractor to collect the cost and usage data of the virtual machines (EC2 instances) that are provisioned in the Amazon Web Services (AWS) cloud. TrueSight Cloud Cost Control uses this resource usage and cost data to provide forecasting, simulated migration, and cost estimations. 

This ETL works in conjunction with the Amazon Web Services - AWS API Extractor. The cost and usage data that is collected is associated with the entities and business services that the AWS API ETL collects.

To learn more about the collection of business service data by ETLs, see Collecting business service data.

Information

In the first couple of days of a month, there might be a data latency of up to two days for the cost and usage data to be available for collection.

Collecting data by using the AWS Cost and Usage ETL

To collect data by using the AWS Cost and Usage ETL, do the following tasks:

I. Complete the preconfiguration tasks.

II. Configure the ETL.

III. Run the ETL.

IV. Verify the data collection.

Step I. Complete the preconfiguration tasks

The ETL requires the following information to connect to AWS and collect data:

  • Access key and secret key of the newly created IAM account
  • S3 bucket name
  • Name of the daily billing report and its prefix
  • Business service tag key

To fetch these details, complete the following preconfiguration tasks. If you have multiple AWS accounts, the owner of the master AWS account must perform the preconfiguration tasks.


Create an S3 bucket to store the daily billing reports of your AWS resources that are generated by AWS.

Amazon S3 is a repository to store data objects in the AWS cloud. Buckets are containers for data objects in Amazon S3. Therefore, you must first create a bucket and upload your data objects to the bucket. You can create multiple buckets to store related data objects.
    1. Log in to the Amazon S3 console at https://console.aws.amazon.com/s3/.
    2. Click Create bucket.

    3. On the Name and region page, configure these properties:
      1. In the Bucket Name box, type a name for your bucket.
        Ensure that the name conforms to the bucket naming guidelines. For more information, see Rules for bucket naming.
      2. From the Region list, select a region for the bucket.
      3. (Optional) From the Copy settings from an existing bucket list, select the bucket. The settings of this bucket will be applied to the bucket that you are creating.
      4. If you have copied the settings from your existing bucket, click Create. Otherwise, click Next.
    4. (Optional) On the Set properties page, enable the following properties. By default, these properties are disabled.
      1. Versioning for the objects in your bucket.
      2. Logging to track details of access requests to the data objects in the bucket.
      3. Tags to organize costs according to projects in the billing report. To add tags, click Add tag, and specify a key-value pair for the tag.
      4. Collection of the object-level API activity by using CloudTrail data events.
      5. Encryption of data objects that will be stored in the bucket.
      6. Click Next.

    5. On the Set permissions page, grant the following permissions:
      1. Bucket owner for managing objects in the bucket
      2. (Optional) Other AWS accounts for managing objects in the bucket
      3. (Not recommended) General public for accessing objects in the bucket
      4. (Optional) Amazon S3 Log Delivery group for accessing objects in the bucket
    6. On the Review page, verify the configuration settings, and click Create bucket. To change a setting, click Edit corresponding to the page where you want to make changes.

For more information about creating an S3 bucket, see Creating a bucket.
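If you script bucket creation, it helps to validate candidate names up front. The following is an illustrative Python sketch of the main naming rules (3 to 63 characters; lowercase letters, digits, hyphens, and periods; starting and ending with a letter or digit; not formatted like an IP address); see the AWS rules for the full list:

```python
import re

def is_valid_bucket_name(name: str) -> bool:
    """Check a candidate bucket name against the main S3 naming rules."""
    # Length must be between 3 and 63 characters.
    if not 3 <= len(name) <= 63:
        return False
    # Lowercase letters, digits, hyphens, and periods only;
    # must start and end with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]*[a-z0-9]", name):
        return False
    # Names that look like IP addresses (e.g. 192.168.1.1) are rejected.
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):
        return False
    return True
```

For example, `is_valid_bucket_name("my-billing-reports")` passes, while names with uppercase letters or underscores fail.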

Configure an AWS IAM user account with specific privileges to access billing reports from the S3 bucket.

If you already have an IAM user account with the necessary permissions to access S3, you can use the access key ID and the secret key of this user during ETL configuration. In such a case, you can skip this step.


    1. Open the IAM console and sign in with your AWS account credentials: https://console.aws.amazon.com/iam/
    2. Click Users > Add user.

    3. On the Add user page, configure the following properties:
      1. In the User Name box, type a name for the IAM user.
      2. Under Select AWS access type, select the Programmatic access check box.
      3. Click Next: Permissions.
    4. Click Create group.

    5. On the Create group page, specify these details:
      1. In the Group name box, type a name for the group.
      2. From the list of policies, select the check box corresponding to the AmazonS3ReadOnlyAccess policy.
      3. Click Create group.
        The group is created, and the specified user is added to this group.
    6. Click Next: Review, review the configured settings, and then click Create user. The user is created with permissions to access the S3 bucket.
    7. Note down the access key ID and the secret access key.

      Tip

      Click Download .csv to download the access key ID and the secret key of the newly created user.
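If you automate the ETL setup, the downloaded file can be parsed for the two values the ETL needs. A sketch assuming the standard credentials.csv column headers; the sample values below are the placeholder keys used later in this topic:

```python
import csv
import io

# Hypothetical sample matching the layout of the downloaded credentials.csv;
# the real file contains the values for your newly created IAM user.
sample = """User name,Access key ID,Secret access key
etl-user,AMAZONACSKEYID007EXAMPLE,wSecRetAcsKeYY712/K9POTUS/BCZthIZIzprvtEXAMPLEKEY
"""

def read_credentials(csv_text: str) -> dict:
    """Return the access key ID and secret access key from the CSV text."""
    row = next(csv.DictReader(io.StringIO(csv_text)))
    return {"access_key_id": row["Access key ID"],
            "secret_access_key": row["Secret access key"]}

creds = read_credentials(sample)
```

The two returned values are what you later enter in the Access Key ID and Secret Access Key fields during ETL configuration.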

Grant permissions to the S3 bucket to store the AWS Cost and Usage report from AWS.
    1. Log in to the Amazon S3 console at https://console.aws.amazon.com/s3/.
    2. From the list of buckets, select the S3 bucket where you want to store the report.
    3. Click Permissions > Bucket Policy, and add the following code in the Bucket policy editor:

      {
        "Version": "2012-10-17",
        "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "386209384616"
          },
          "Action": [
            "s3:GetBucketAcl",
            "s3:GetBucketPolicy"
          ],
          "Resource": "arn:aws:s3:::bucketname"
        },
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "386209384616"
          },
          "Action": "s3:PutObject",
          "Resource": "arn:aws:s3:::bucketname/*"
        }
        ]
      }
    4. Replace bucketname with the name of your bucket. Do not change the Principal number 386209384616. AWS uses it to send reports to your bucket.
    5. Click Save.
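If you manage bucket policies as code, the policy above can be generated with the bucket name substituted. A minimal Python sketch (illustrative only; the console editor is the documented path):

```python
import json

def build_billing_bucket_policy(bucket_name: str) -> str:
    """Return the bucket policy from the step above with the bucket name
    substituted. The principal 386209384616 is used by AWS to deliver
    reports and must not be changed."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow",
             "Principal": {"AWS": "386209384616"},
             "Action": ["s3:GetBucketAcl", "s3:GetBucketPolicy"],
             "Resource": f"arn:aws:s3:::{bucket_name}"},
            {"Effect": "Allow",
             "Principal": {"AWS": "386209384616"},
             "Action": "s3:PutObject",
             "Resource": f"arn:aws:s3:::{bucket_name}/*"},
        ],
    }
    return json.dumps(policy, indent=2)
```

The output of `build_billing_bucket_policy("my-reports")` can be pasted directly into the Bucket policy editor.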

Schedule the AWS Cost and Usage report to be generated daily. After you schedule the report generation, it becomes available for collection from the next day.

The AWS Cost and Usage report provides information about the usage of your AWS resources and the estimated cost for the usage. The report contains the details, such as AWS services that are used, the duration of usage, the amount of data transfer, and the used storage space.

If you use the consolidated billing feature, the report is available only to the master account and includes the cost and usage details of the member accounts that are associated with the master account.


    1. Log in to the Amazon S3 console: https://console.aws.amazon.com/s3
    2. Open the Billing and Cost Management console: https://console.aws.amazon.com/billing/
    3. Click Reports > Create report.

    4. On the Select Content page, configure the following properties:
      1. Report name: Type a name for the report.
      2. Time unit: Select Daily to aggregate report data every day.
      3. Include: Select the Resource IDs check box to associate the resources with the business services.
      4. Enable support for: Select whether you want to upload the report to Amazon Redshift or Amazon QuickSight.
      5. Click Next.
    5. On the Report details page, configure the following properties:
      1. In the S3 bucket box, type the name of your S3 bucket where you want the reports to be delivered, and click Verify to check whether the bucket has appropriate permissions to store the reports.
      2. In the Report path prefix box, type the prefix that you want to append to the report name.
      3. Click Next.
    6. Review the settings, and click Review and Complete.

For more information, see Turn on daily reports.

The ETL needs access to specific API endpoints. If your setup is behind a firewall, enable access to these endpoints:
  • http://<region>.amazonaws.com/
  • http://monitoring.<region>.amazonaws.com/

Where <region> is one of the AWS regions. For more information about regions, see Regions and Availability Zones.

Important

The ETL requires access to all regions, even if your Amazon instances are provisioned in only some of the regions.
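If you maintain the firewall allow-list as configuration, the endpoint URLs can be generated from your region list. A minimal sketch that only assembles the two URL patterns listed above:

```python
def firewall_endpoints(regions):
    """Return the endpoint URLs the ETL needs for each region,
    following the two patterns listed above."""
    urls = []
    for region in regions:
        urls.append(f"http://{region}.amazonaws.com/")
        urls.append(f"http://monitoring.{region}.amazonaws.com/")
    return urls

# Example with two real region names; the ETL requires all regions.
allow_list = firewall_endpoints(["us-east-1", "eu-west-1"])
```

Remember that the allow-list must cover every AWS region, not only the regions where your instances run.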

Step II. Configure the ETL

You must configure the ETL to connect to AWS for collecting the cost and usage data of AWS entities. ETL configuration includes specifying the basic and optional advanced properties. While configuring the basic properties is sufficient, you can optionally configure the advanced properties for additional customization.

A. Configuring the basic properties

Some of the basic properties display default values. You can modify these values when required.

To configure the basic properties:

  1. In the TrueSight Capacity Optimization console, navigate to Administration > ETL & System Tasks, and select ETL tasks.
  2. On the ETL tasks page, click Add > Add ETL. The Add ETL page displays the configuration properties. You must configure properties in the following tabs: Run configuration, Entity catalog, and Amazon Web Services Connection.

  3. On the Run Configuration tab, select Amazon Web Services - Cost and Usage Extractor from the ETL module list. The name of the ETL is displayed in the ETL task name box. You can edit this field to customize the name.



  4. Click the Entity catalog tab, and select one of the following options:
    • Shared Entity Catalog: Retain the default selection to share the entity catalog with the AWS API ETL, which extracts infrastructure data of the AWS resources. From the Sharing with Entity Catalog list, select the entity catalog name that is shared between ETLs.
    • Private Entity Catalog: Select this option if only this ETL is used for extracting data from the AWS resources.

  5. Click the Amazon Web Services Connection tab, and configure the following properties:

    • Access Key ID: Specify the access key ID of the IAM user that you created during the preconfiguration procedure. For example, a typical access key ID looks like AMAZONACSKEYID007EXAMPLE.
    • Secret Access Key: Specify the secret access key that is associated with the access key ID. For example, a typical secret access key looks like wSecRetAcsKeYY712/K9POTUS/BCZthIZIzprvtEXAMPLEKEY.
    • S3 Bucket name: Specify the name of the S3 bucket where you store the billing reports.
    • Report prefix: Specify the prefix that is attached to the report. (The prefix corresponds to the directory level in the S3 bucket hierarchy.)
    • Report name: Specify the billing report name.
    • Business Service Tag Key: Specify the business service tag key that you used in the AWS API ETL for collecting business service data. The AWS API ETL creates business service entities in the Workspace and maps resources to each business service. The AWS Cost and Usage ETL uses tags of resources to organize resource costs under business services. The default tag key is Service.
    • Use proxy: Specify whether you want to configure a proxy server. The default selection is No. If you select Yes, provide the following details:
      • The fully qualified domain name and the port number of the proxy server host.
      • If the proxy server requires authentication, select Yes, and specify the proxy server user name and password.
      By default, the proxy server uses the HTTPS protocol for communication.

    The following image shows sample configuration values for the basic properties.



  6. (Optional) Override the default values of properties in the following tabs:

    • Module selection: Select one of the following options:
      • Based on datasource: This is the default selection.
      • Based on Open ETL template: Select this option only if you want to collect data that is not supported by TrueSight Capacity Optimization.
    • Module description: A short description of the ETL module.
    • Execute in simulation mode: By default, Yes is selected so that the ETL run validates connectivity with the data source and surfaces any configuration issues without loading data into the database. This option is useful when you want to test a new ETL task. To run the ETL in the production mode, select No. BMC recommends that you run the ETL in the simulation mode after ETL configuration and then run it in the production mode.

    • Task group: Select a task group to classify the ETL.
    • Running on scheduler: Select one of the following schedulers for running the ETL:
      • Primary Scheduler: Runs on the Application Server.
      • Generic Scheduler: Runs on a separate computer.
      • Remote: Runs on remote computers.
    • Maximum execution time before warning: Indicates the number of hours, minutes, or days for which the ETL must run before generating warnings or alerts, if any.
    • Frequency: Select one of the following frequencies to run the ETL:
      • Predefined: This is the default selection. Select an hourly, daily, or weekly frequency, and then select a time and a day to start the ETL run accordingly.
      • Custom: Specify a custom frequency, select an appropriate unit of time, and then specify a day and a time to start the ETL run.

  7. Click Save.
    The details of the newly configured AWS Cost and Usage ETL are displayed.

(Optional) B. Configuring the advanced properties

You can configure the advanced properties to change the way the ETL works and to define the data collection period.

To configure the advanced properties:

  1. On the Add ETL page, click Advanced.
  2. Configure the following properties:

    • Run configuration name: Specify the name that you want to assign to this ETL task configuration. The default configuration name is displayed. You can use this name to differentiate between the run configuration settings of ETL tasks.
    • Deploy status: Select the deploy status for the ETL task. For example, you can initially select Test and change it to Production after verifying that the ETL run results are as expected.
    • Log level: Specify the level of detail that you want to include in the ETL log file. Select one of the following options:
      • 1 - Light: Adds the bare minimum activity logs to the log file.
      • 5 - Medium: Adds medium-detailed activity logs to the log file.
      • 10 - Verbose: Adds detailed activity logs to the log file.
      Use log level 5 as a general practice. You can select log level 10 for debugging and troubleshooting purposes.
    • Datasets: Specify the datasets that you want to add to the ETL run configuration. The ETL collects data of the metrics that are associated with these datasets.
      1. Click Edit.
      2. Select one (click) or more (Shift+click) datasets from the Available datasets list and click >> to move them to the Selected datasets list.
      3. Click Apply.
      The ETL collects data of the metrics associated with the datasets that are available in the Selected datasets list.

    To define a period during which the ETL must collect data, click Advanced, and configure the following property:

    • Extraction mode: The extraction mode denotes the data collection period. Depending on the period for which you want to collect data, select one of the following options:
      • Regular - daily import: Select to collect daily data. In the first run, the ETL collects data for the past six months. On subsequent runs, it collects the daily data (after the last counter value).
      • Historical - import historical data (do not use for daily scheduling): Select to collect the historical data for the period that you specify in the Extraction months property. The default historical data collection period is 12 months.

      Recommendation

      BMC recommends that you create a separate run configuration to extract historical data. Do not use the same run configuration for your daily runs and historical data extraction.

      By default, Regular is selected. After you run the ETL, the last counter value (LAST_EXTRACTION_MAX_TIME) is set to the day when the data was last extracted.

    • List of properties: Specify additional properties for the ETL that act as user inputs during the run. You can specify these values now, or later by accessing the "You can manually edit ETL properties from this page" link that is displayed for the ETL in the view mode.
      1. Click Add.
      2. In the etl.additional.prop.n field, specify an additional property.
      3. Click Apply.
      Repeat this task to add more properties.

    • Empty dataset behavior: Specify the action for the loader if it encounters an empty dataset:
      • Warn: Generate a warning about loading an empty dataset.
      • Ignore: Ignore the empty dataset and continue parsing.
    • ETL log file name: The name of the file that contains the ETL run log. The default value is %BASE/log/%AYEAR%AMONTH%ADAY%AHOUR%MINUTE%TASKID.
    • Maximum number of rows for CSV output: A numeric value that limits the size of the output files.
    • CSV loader output file name: The name of the file that is generated by the CSV loader. The default value is %BASE/output/%DSNAME%AYEAR%AMONTH%ADAY%AHOUR%ZPROG%DSID%TASKID.
    • Capacity Optimization loader output file name: The name of the file that is generated by the TrueSight Capacity Optimization loader. The default value is %BASE/output/%DSNAME%AYEAR%AMONTH%ADAY%AHOUR%ZPROG%DSID%TASKID.
    • Detail mode: Specify whether you want to collect raw data in addition to the standard data. Select one of the following options:
      • Standard: Data is stored in the database in different tables at the following time granularities: Detail (configurable; by default, 5 minutes), Hourly, Daily, and Monthly.
      • Raw also: Data is stored in the database in different tables at the following time granularities: Raw (as available from the original data source), Detail (configurable; by default, 5 minutes), Hourly, Daily, and Monthly.
      • Raw only: Data is stored in the database in a table only at the Raw granularity (as available from the original data source).
      For more information, see Accessing data using public views and Sizing and scalability considerations.
    • Remove domain suffix from datasource name (only for systems): Select True to remove the domain from the data source name. For example, server.domain.com is saved as server. The default selection is False.
    • Leave domain suffix to system name (only for systems): Select True to keep the domain in the system name. For example, server.domain.com is saved as is. The default selection is False.
    • Update grouping object definition (only for systems): Select True if you want the ETL to update the grouping object definition for a metric that is loaded by the ETL. The default selection is False.
    • Skip entity creation (only for ETL tasks sharing lookup with other tasks): Select True if you do not want this ETL to create entities, and to discard data from its data source for entities not found in Capacity Optimization. One of the other ETLs that share the lookup creates the new entity. The default selection is False.
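The file-name defaults above are built from %-style macros (%BASE, %AYEAR, and so on). The following is a hypothetical sketch of how such a template could be expanded; the macro names and sample values are assumptions taken from the defaults shown, not the product's actual resolver:

```python
def expand_template(template: str, values: dict) -> str:
    """Expand %-style macros in a loader file-name template.
    Longer macro names are replaced first so that adjacent macros
    such as %AYEAR%AMONTH resolve cleanly."""
    for key in sorted(values, key=len, reverse=True):
        template = template.replace("%" + key, str(values[key]))
    return template

# Hypothetical values for one daily run.
name = expand_template(
    "%BASE/output/%DSNAME%AYEAR%AMONTH%ADAY",
    {"BASE": "/opt/cpit", "DSNAME": "awscost",
     "AYEAR": 2023, "AMONTH": "04", "ADAY": "01"},
)
```

Here `name` resolves to `/opt/cpit/output/awscost20230401`, which illustrates why the default templates produce one uniquely named file per dataset and run.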

    • Hour mask: Specify a value to run the task only during particular hours within a day. For example, 0 – 23 or 1, 3, 5 – 12.
    • Day of week mask: Select the days of the week on which the task can run. To avoid setting this filter, do not select any option for this field.
    • Day of month mask: Specify a value to run the task only on particular days of a month. For example, 5, 9, 18, 27 – 31.
    • Apply mask validation: Select False to temporarily turn off the mask validation without removing any values. The default selection is True.
    • Execute after time: Specify a value in the hours:minutes format (for example, 05:00 or 16:00). The task run begins only after the specified time has elapsed.
    • Enqueueable: Specify whether to ignore the next run command or run it after the current task. Select one of the following options:
      • False: Ignores the next run command when a particular task is already running. This is the default selection.
      • True: Starts the next run command immediately after the current running task is completed.
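The hour-mask syntax above (single hours and ranges, comma-separated) is easy to validate programmatically. A minimal sketch, assuming only the syntax shown in the examples:

```python
def parse_hour_mask(mask: str) -> set:
    """Parse an hour-mask string such as "0 - 23" or "1, 3, 5 - 12"
    into the set of hours during which the task may run."""
    hours = set()
    for part in mask.replace(" ", "").split(","):
        if "-" in part:
            start, end = part.split("-")
            hours.update(range(int(start), int(end) + 1))
        else:
            hours.add(int(part))
    return hours
```

For example, `parse_hour_mask("1, 3, 5 - 12")` yields the ten hours 1, 3, and 5 through 12.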

  3. Click Save.
    The ETL tasks page shows the details of the newly configured AWS Cost and Usage ETL.

Step III. Run the ETL

After you configure the ETL, you can run it to collect data. You can run the ETL in the following modes:

A. Simulation mode: Only validates the connection to the data source; it does not collect data. Use this mode when you run the ETL for the first time or after you make any changes to the ETL configuration.

B. Production mode: Collects data from the data source.

Important

Ensure that you first run the AWS API ETL before running the AWS Cost and Usage ETL.

A. Running the ETL in the simulation mode

To run the ETL in the simulation mode:

  1. In the TrueSight Capacity Optimization console, navigate to Administration > ETL & System Tasks, and select ETL tasks.
  2. On the ETL tasks page, click the ETL. The ETL details are displayed.



  3. In the Run configurations table, click Edit to modify the ETL configuration settings.
  4. On the Run configuration tab, ensure that the Execute in simulation mode option is set to Yes, and click Save.
  5. Click Run active configuration. A confirmation message about the ETL run job submission is displayed.
  6. On the ETL tasks page, check the ETL run status in the Last exit column.
    OK: Indicates that the ETL ran without any error. You are ready to run the ETL in the production mode.
  7. If the ETL run status is Warning, Error, or Failed:
    1. On the ETL tasks page, click the corresponding icon in the last column of the ETL name row.
    2. Check the log and reconfigure the ETL if required.
    3. Run the ETL again.
    4. Repeat these steps until the ETL run status changes to OK.

B. Running the ETL in the production mode

You can run the ETL manually when required or schedule it to run at a specified time.

Running the ETL manually

  1. On the ETL tasks page, click the ETL. The ETL details are displayed.
  2. In the Run configurations table, click Edit to modify the ETL configuration settings. The Edit run configuration page is displayed.
  3. On the Run configuration tab, select No for the Execute in simulation mode option, and click Save.
  4. To run the ETL immediately, click Run active configuration. A confirmation message about the ETL run job submission is displayed.
    When the ETL is run, it collects data from the source and transfers it to the TrueSight Capacity Optimization database.

Scheduling the ETL run

By default, the ETL is scheduled to run daily. You can customize this schedule by changing the frequency and period of running the ETL.

To configure the ETL run schedule:

  1. On the ETL tasks page, click the ETL, and click Edit. The ETL details are displayed.

  2. On the Edit task page, do the following, and click Save:

    • Specify a unique name and description for the ETL task.
    • In the Maximum execution time before warning field, specify the duration for which the ETL must run before generating warnings or alerts, if any.
    • Select a predefined or custom frequency for starting the ETL run. The default selection is Predefined.
    • Select the task group and the scheduler to which you want to assign the ETL task.
  3. Click Schedule. A message confirming the scheduling job submission is displayed.
    When the ETL runs as scheduled, it collects data from the source and transfers it to the TrueSight Capacity Optimization database.

Step IV. Verify data collection

Verify that the ETL ran successfully and the AWS cost data is refreshed in the TrueSight console.

To verify whether the ETL ran successfully:

  1. In the TrueSight Capacity Optimization console, click Administration > ETL & System Tasks > ETL tasks.
  2. In the Last exec time column corresponding to the ETL name, verify that the current date and time are displayed.

To verify whether the AWS cost data is refreshed:

  1. Log in to the TrueSight console.
  2. Click Cloud Cost Control, and verify whether the AWS cost data is refreshed.

Related topics

Working with ETLs

Analyzing and forecasting multi-cloud costs

Amazon Web Services documentation
