Control-M for Google Dataproc
Google Cloud Platform (GCP) Dataproc is a managed service that enables you to perform cloud-based big data processing and machine learning.
Control-M for Google Dataproc enables you to do the following:
- Connect to Google Cloud Platform from a single computer with secure login, which eliminates the need to authenticate repeatedly.
- Trigger jobs based on any workflow template created on Google Dataproc.
- Integrate Dataproc jobs with other Control-M jobs into a single scheduling environment.
- Monitor the Dataproc status and view the results in the Monitoring domain.
- Attach an SLA job to your entire Google Dataproc service.
- Introduce all Control-M capabilities to Google Dataproc, including advanced scheduling criteria, complex dependencies, quantitative and control resources, and variables.
- Run up to 50 Google Dataproc jobs simultaneously per Control-M/Agent.
Setting up Control-M for Google Dataproc
This procedure describes how to install the Google Dataproc plug-in, create a connection profile, and define a Google Dataproc job in Helix Control-M and Automation API.
Before You Begin
- Verify that Automation API is installed, as described in Setting up the API.
- Verify that Agent version 9.0.21.080 or later is installed.
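The Agent version requirement above can be checked with a small comparison helper. This is a minimal sketch, not a BMC utility: `version_at_least` is a hypothetical function name, and it assumes you have already obtained the installed Agent version string by some other means and pass it in as the first argument.

```shell
# Hypothetical helper: succeeds (exit 0) when version $1 is greater than
# or equal to version $2. Versions are compared field by field as numbers,
# so "080" is treated as 80.
version_at_least() {
  awk -v have="$1" -v need="$2" 'BEGIN {
    n = split(have, h, ".")
    m = split(need, r, ".")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      # Missing fields evaluate to 0 in awk.
      if (h[i] + 0 > r[i] + 0) exit 0
      if (h[i] + 0 < r[i] + 0) exit 1
    }
    exit 0
  }'
}
```

For example, `version_at_least "$installed_version" 9.0.21.080 || echo "Agent upgrade required"` prints a warning when the version held in the assumed variable `installed_version` is too old.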
Begin
- On the Agent host, set the Java environment variable by running one of the following commands through a command line:
  - Linux:
    - Bourne shell/bash: export BMC_INST_JAVA_HOME=<java_11_directory>
    - csh/tcsh: setenv BMC_INST_JAVA_HOME <java_11_directory>
  - Windows: set BMC_INST_JAVA_HOME="<java_11_directory>"
- Run one of the following API commands:
  - For a fresh installation, use the provision image command:
    - Linux: ctm provision image GDP_plugin.Linux
    - Windows: ctm provision image GDP_plugin.Windows
  - For an upgrade, use the following command:
    ctm provision agent::update
- Create a GCP Dataproc connection profile in Helix Control-M or Automation API, as follows:
  - Helix Control-M: Creating a Centralized Connection Profile with GCP Dataproc Connection Profile Parameters
  - Automation API: ConnectionProfile:GCP Dataproc
- Define a GCP Dataproc job in Helix Control-M or Automation API, as follows:
  - Helix Control-M: Create a job, and then define the GCP Dataproc-specific parameters, as described in GCP Dataproc Job parameters.
  - Automation API: Job:GCP Dataproc
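The installation steps in this procedure (choosing the image for the Agent host's operating system, or using the update command for an upgrade) can be sketched as a small shell helper. This is an illustrative sketch only: the function names `gdp_image_for` and `gdp_install_command` are hypothetical, and the helper only prints the command so that you can review it before running it. The image names and `ctm` commands are the ones listed in the procedure.

```shell
# Hypothetical helper: map the Agent host operating system to the
# plug-in image name used for a fresh installation.
gdp_image_for() {
  case "$1" in
    Linux)   echo "GDP_plugin.Linux" ;;
    Windows) echo "GDP_plugin.Windows" ;;
    *)       echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print (do not run) the Automation API command
# for the given mode ("fresh" or "upgrade") and operating system.
gdp_install_command() {
  mode="$1"
  os="$2"
  case "$mode" in
    fresh)
      image=$(gdp_image_for "$os") || return 1
      echo "ctm provision image $image"
      ;;
    upgrade)
      echo "ctm provision agent::update"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1
      ;;
  esac
}
```

On a Linux Agent host, you might first run `export BMC_INST_JAVA_HOME=<java_11_directory>` as described in the first step, and then call `gdp_install_command fresh Linux` to print the command to run.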