Amazon Web Services - AWS API Extractor
Use the Amazon Web Services - AWS API Extractor to collect configuration and performance data of your virtual machines (EC2 instances) that are provisioned in the Amazon Web Services (AWS) cloud. The collected data is used for analyzing and optimizing the capacity of your AWS infrastructure.
This ETL uses the AWS Java SDK version 1.11.60 to connect to AWS, and makes API calls to the following AWS services:
- EC2: For metrics of EC2 instances and EBS volumes
- CloudWatch: For metrics of EC2 instances, EBS volumes, and Auto Scaling groups
- Auto Scaling: For metrics of Auto Scaling groups
Depending on your requirements, you can configure the ETL to collect data from a single AWS account or from multiple AWS accounts. When configured for multiple accounts, one of the accounts is used as the main account to retrieve data from all the accounts.
If you apply tags to organize your AWS resources by related business services, you can configure the ETL to use these tags to display the AWS metrics by business services. This ETL works in conjunction with the Amazon Web Services - Cost and Usage Extractor, which collects cost and usage details of AWS entities.
Collecting data by using the AWS API ETL
To collect data by using the AWS API ETL, do the following tasks:
I. Complete the preconfiguration tasks.
II. Configure the ETL.
III. Run the ETL.
Depending on your AWS account setup, complete the steps for a single AWS account or for multiple AWS accounts:
Step | Details |
---|---|
Configure a policy to specify the permissions for the IAM user. | |
Create an IAM user. You will need to specify the access key ID and the secret key of this user while configuring the ETL. The AWS SDK requires these keys to automatically sign the requests that the ETL sends to AWS. | |
Tag your resources by using a business service tag key name such as Service to organize the AWS resources by business services. You will need to specify this business service tag key name while configuring the ETL. | For information about tagging your resources, see Collecting business service data. |
Basic requirements
Generate an external ID, which you will need to use when you configure the additional AWS accounts. The external ID is an alphanumeric string. Use any alphanumeric string or use a tool, such as GUID UNIX, to generate it.
To organize your resources by business services, ensure that you tag your resources by using a business service tag key name such as Service. You need to specify this business service tag key name while configuring the ETL.
For more information, see Tagging your Amazon EC2 resources.
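The external ID mentioned above only needs to be an alphanumeric string. As a minimal sketch, assuming that a random 32-character hexadecimal value is acceptable in your environment, it could be generated with the Python standard library. The function name `generate_external_id` is hypothetical and is not part of the ETL.

```python
import uuid


def generate_external_id() -> str:
    """Return a random 32-character string of hexadecimal digits.

    Hypothetical helper: any alphanumeric string works as an external ID,
    a UUID is just one convenient way to get a hard-to-guess value.
    """
    return uuid.uuid4().hex


external_id = generate_external_id()
print(external_id)
```

Note the generated value down, because the same external ID is needed again when you configure the additional AWS accounts and the ETL.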
Configure the main AWS account
Step | Details |
---|---|
Access the main AWS account. | Log on to the AWS Management Console. |
Obtain the AWS account ID and note it down. | |
Configure a policy (tsco-aws-etl-policy) to specify permissions for the user of the main AWS account. | |
Create an IAM user (tsco-etl-user) in the main account and assign the policy (tsco-aws-etl-policy) to the user. You need the access key details of this account while configuring the ETL. The access keys include a key ID and a secret key. The AWS SDK requires these keys to automatically sign the requests that the ETL sends to AWS. For more information about managing access keys for IAM users, see Managing Access Keys for IAM Users. | |
Configure the additional AWS account
You must repeat these steps for every additional AWS account.
Step | Details |
---|---|
Access the additional AWS account. | Log in to the AWS Management Console. |
Obtain the account ID and note it down. You will need to enter this account ID when configuring a policy in the main AWS account to include the additional account details. You will also need to enter it while configuring the ETL. | See the Obtain the AWS account ID step in the Configure the main AWS account section. |
Configure a policy (tsco-aws-etl-policy) to specify permissions for the user of the additional AWS account. | See the Configure a policy step in the Configure the main AWS account section. |
Create a cross-account access role (tsco-cross-account-role). This role gives the main AWS account user (tsco-etl-user) federated read-only access to the AWS services in the additional account and enables account switching. | |
Access the main AWS account again. | Log in to the AWS Management Console. |
Configure a policy file (tsco-assume-role-policy.json) to include the additional account details. If you are configuring the first additional AWS account, create a policy file. Otherwise, update the existing file with the additional AWS account details. Information: A single policy file can include details of all the additional AWS accounts. | |
Enable the policy (tsco-assume-role-policy.json) that includes additional AWS account details in the main AWS account. | |
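The detailed contents of the cross-account role and of tsco-assume-role-policy.json are not shown on this page. The following sketch only illustrates the general shape that such documents usually take in the standard IAM policy grammar: a trust policy on the role that requires the external ID, and a policy in the main account that allows `sts:AssumeRole` on the role in each additional account. The account IDs, the external ID value, and the function names are hypothetical placeholders, and the actual documents required by the ETL may differ.

```python
import json

# Hypothetical placeholder values; replace them with your own.
MAIN_ACCOUNT_ID = "111111111111"
ADDITIONAL_ACCOUNT_IDS = ["222222222222", "333333333333"]
EXTERNAL_ID = "0123456789abcdef0123456789abcdef"
ROLE_NAME = "tsco-cross-account-role"  # role name used in the steps above


def build_trust_policy(main_account_id: str, external_id: str) -> dict:
    """Trust policy for the role in an additional account (assumed shape).

    Lets principals of the main account assume the role, but only when
    they present the agreed external ID.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{main_account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def build_assume_role_policy(account_ids: list, role_name: str) -> dict:
    """Policy for the main account user (assumed shape).

    One document lists the role ARN of every additional account.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": [
                    f"arn:aws:iam::{account_id}:role/{role_name}"
                    for account_id in account_ids
                ],
            }
        ],
    }


trust_policy = build_trust_policy(MAIN_ACCOUNT_ID, EXTERNAL_ID)
assume_role_policy = build_assume_role_policy(ADDITIONAL_ACCOUNT_IDS, ROLE_NAME)
print(json.dumps(assume_role_policy, indent=2))
```

Because the resource list is built from the account IDs, adding another additional account means appending one ID and regenerating the file, which matches the note that a single policy file can cover all additional accounts.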
If your setup is behind a firewall, provide access to the following endpoints:
- https://<region>.amazonaws.com/
- https://monitoring.<region>.amazonaws.com/
where <region> is one of the regions in AWS. For more information about regions, see Regions and Availability Zones.
Important
The ETL requires access to all regions even if your Amazon instances are provisioned in only some of the regions.
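Because the ETL needs access to every region, a firewall allowlist can be generated rather than typed by hand. The sketch below expands the two endpoint patterns listed above for a list of region names. The two regions shown are only a sample, and the function name `endpoints_for` is hypothetical; take the complete region list from the AWS documentation.

```python
# Endpoint patterns copied from the list above.
ENDPOINT_TEMPLATES = [
    "https://{region}.amazonaws.com/",
    "https://monitoring.{region}.amazonaws.com/",
]

# Sample only; the ETL requires access to all AWS regions.
SAMPLE_REGIONS = ["us-east-1", "eu-west-1"]


def endpoints_for(regions: list) -> list:
    """Return every endpoint URL for every given region, region by region."""
    return [
        template.format(region=region)
        for region in regions
        for template in ENDPOINT_TEMPLATES
    ]


for url in endpoints_for(SAMPLE_REGIONS):
    print(url)
```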
You must configure the ETL to connect to AWS for data collection. ETL configuration includes specifying the basic and optional advanced properties. While configuring the basic properties is sufficient, you can optionally configure the advanced properties for additional customization.
A. Configuring the basic properties
Some of the basic properties display default values. You can modify these values if required.
To configure the basic properties:
- In the TrueSight Capacity Optimization console, navigate to Administration > ETL & System Tasks, and select ETL tasks.
- On the ETL tasks page, click Add > Add ETL. The Add ETL page displays the configuration properties. You must configure properties in the following tabs: Run configuration, Entity catalog, and Amazon Web Services Connection.
- On the Run configuration tab, select Amazon Web Services - AWS API Extractor from the ETL Module list. The name of the ETL is displayed in the ETL task name field. You can edit this field to customize the name.
- Click the Entity catalog tab, and select one of the following options:
  - Shared Entity Catalog: Retain the default selection to share the entity catalog with the AWS Cost and Usage ETL, which extracts the cost and usage data of entities.
    - From the Sharing with Entity Catalog list, select the entity catalog name that is shared between ETLs.
  - Private Entity Catalog: Select if this is the only ETL that extracts data from the AWS resources.
- Click the Amazon Web Services Connection tab, and configure the following properties:
Property | Description |
---|---|
AWS Account access mode | Depending on your AWS account setup, select Single or Multiple. You must use the values that you obtained during the preconfiguration procedure. Access Key ID: Specify the access key ID of the IAM user of the AWS account. For example, a typical Access Key ID might look like this: AMAZONACSKEYID007EXAMPLE. Secret Access Key: Specify the secret access key associated with the Access Key ID. For example, a typical Secret Access Key might look like this: wSecRetAcsKeYY712/K9POTUS/BCZthIZIzprvtEXAMPLEKEY. For multiple accounts, specify the following additional details: the cross-account name and external ID, and the IDs of the additional accounts that you configured. |
Use proxy | Specify whether you want to configure a proxy server (the default selection is No), and provide the following details: the fully qualified domain name and the port number of the proxy server host. If the proxy server requires authentication, select Yes, and specify the proxy server user name and password. By default, the proxy server uses the HTTPS protocol for communication. |
Business Service hierarchy | If you want to create and view AWS data by business services, retain the default selection of Create Business Service hierarchy based on specified tag key. Specify the appropriate tag key name. For example, Service. |

Example scenario:
You have VMs that are tagged as follows:
- AS1: {user=John, Purpose=Dev, Service=Data Solutions}
- vl-pub-bco-qa35: {user=Adam, Purpose=Production, Service=Data Solutions}
- vl-pun-bco-qa20: {user=Jane, Purpose=QA, Service=Data Solutions}
When you run the ETL, data is displayed in a hierarchy as follows:
If you do not use business services, data is displayed as follows:
The following image shows sample configuration values for the basic properties.
(Optional) Override the default values of properties in the following tabs:
- Click Save.
The ETL tasks page shows the details of the newly configured AWS API ETL.
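To make the example scenario for the Business Service hierarchy property more concrete, the following sketch groups the three sample VMs by the value of a chosen tag key. It is only an illustration of the grouping idea and is not the ETL's implementation; the function name `group_by_tag` and the "(untagged)" label are hypothetical.

```python
from collections import defaultdict

# Sample VMs and tags taken from the example scenario.
VMS = {
    "AS1": {"user": "John", "Purpose": "Dev", "Service": "Data Solutions"},
    "vl-pub-bco-qa35": {"user": "Adam", "Purpose": "Production", "Service": "Data Solutions"},
    "vl-pun-bco-qa20": {"user": "Jane", "Purpose": "QA", "Service": "Data Solutions"},
}


def group_by_tag(vms: dict, tag_key: str) -> dict:
    """Group VM names by the value of tag_key.

    VMs that lack the tag are collected under a hypothetical "(untagged)" label.
    """
    groups = defaultdict(list)
    for name, tags in vms.items():
        groups[tags.get(tag_key, "(untagged)")].append(name)
    return dict(groups)


by_service = group_by_tag(VMS, "Service")
for service, names in by_service.items():
    print(service, names)
```

With the tag key Service, all three VMs fall under the single business service Data Solutions. Choosing a different key, such as Purpose, would split them into three groups, which is why the tag key name that you enter must match the key that you actually use for business services.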
(Optional) B. Configuring the advanced properties
You can configure the advanced properties to change the way the ETL works or to collect additional metrics.
While configuring these advanced properties, you will need to specify the path to the JSON files that contain the instance type configuration metrics and other additional metrics. For more information, see Collecting data for additional AWS metrics.
To configure the advanced properties:
- On the Add ETL page, click Advanced.
Configure the following properties:
- Click Save.
The ETL tasks page shows the details of the newly configured AWS API ETL.
After you configure the ETL, you can run it to collect data. You can run the ETL in the following modes:
A. Simulation mode: Only validates the connection to the data source and does not collect data. Use this mode when you run the ETL for the first time or after you make any changes to the ETL configuration.
B. Production mode: Collects data from the data source.
A. Running the ETL in the simulation mode
To run the ETL in the simulation mode:
- In the TrueSight Capacity Optimization console, navigate to Administration > ETL & System Tasks, and select ETL tasks.
- On the ETL tasks page, click the ETL. The ETL details are displayed.
- In the Run configurations table, click Edit to modify the ETL configuration settings.
- On the Run configuration tab, ensure that the Execute in simulation mode option is set to Yes, and click Save.
- Click Run active configuration. A confirmation message about the ETL run job submission is displayed.
- On the ETL tasks page, check the ETL run status in the Last exit column.
  - OK: Indicates that the ETL ran without any error. You are ready to run the ETL in the production mode.
  - If the ETL run status is Warning, Error, or Failed:
    - On the ETL tasks page, click in the last column of the ETL name row.
    - Check the log and reconfigure the ETL if required.
    - Run the ETL again.
    - Repeat these steps until the ETL run status changes to OK.
B. Running the ETL in the production mode
You can run the ETL manually when required or schedule it to run at a specified time.
Running the ETL manually
- On the ETL tasks page, click the ETL. The ETL details are displayed.
- In the Run configurations table, click Edit to modify the ETL configuration settings. The Edit run configuration page is displayed.
- On the Run configuration tab, select No for the Execute in simulation mode option, and click Save.
- To run the ETL immediately, click Run active configuration. A confirmation message about the ETL run job submission is displayed.
When the ETL is run, it collects data from the source and transfers it to the TrueSight Capacity Optimization database.
Scheduling the ETL run
By default, the ETL is scheduled to run daily. You can customize this schedule by changing the frequency and period of running the ETL.
To configure the ETL run schedule:
- On the ETL tasks page, click the ETL, and click Edit. The ETL details are displayed.
- On the Edit task page, do the following, and click Save:
  - Specify a unique name and description for the ETL task.
  - In the Maximum execution time before warning field, specify the duration for which the ETL must run before generating warnings or alerts, if any.
  - Select a predefined or custom frequency for starting the ETL run. The default selection is Predefined.
  - Select the task group and the scheduler to which you want to assign the ETL task.
- Click Schedule. A message confirming the scheduling job submission is displayed.
When the ETL runs as scheduled, it collects data from the source and transfers it to the TrueSight Capacity Optimization database.
Verify that the ETL ran successfully and check whether the AWS data is refreshed in the Workspace.
To verify whether the ETL ran successfully:
- In the TrueSight Capacity Optimization console, click Administration > ETL & System Tasks > ETL tasks.
- In the Last exec time column corresponding to the ETL name, verify that the current date and time are displayed.
To verify that the AWS data is refreshed:
- In the TrueSight Capacity Optimization console, click Workspace.
- Expand (Domain name) > Systems > AWS Cloud > Instances.
- In the left pane, verify that the hierarchy displays the new and updated AWS instances that you have provisioned in the AWS cloud.
- Click an AWS virtual machine instance, and click the Metrics tab in the right pane.
- Check if the Last Activity column in the Configuration data and Performance metrics tables displays the current date.
The following image shows sample metrics data. To learn more about these metrics and other related concepts, see Entities, lookup information, and metrics for AWS API ETL.
Where to go from here
After data is collected, you can analyze the cost of your AWS resources based on different dimensions. For more information, see Analyzing and forecasting multi-cloud costs.