Control-M for GCP Dataproc

Google Cloud Platform (GCP) Dataproc enables you to perform cloud-based big data processing and machine learning.

Control-M for GCP Dataproc enables you to do the following:

  • Execute single GCP Dataproc jobs or Workflow Template jobs.
  • Manage GCP Dataproc credentials in a secure connection profile.
  • Connect to any GCP Dataproc endpoint.
  • Integrate GCP Dataproc jobs with other Control-M jobs into a single scheduling environment.
  • Monitor the status, results, and output of GCP Dataproc jobs in the Monitoring domain.
  • Attach an SLA job to your GCP Dataproc jobs.
  • Introduce all Control-M capabilities to Control-M for GCP Dataproc, including advanced scheduling criteria, complex dependencies, Resource Pools, Lock Resources, and variables.
  • Run 50 GCP Dataproc jobs simultaneously per Agent.

Control-M for GCP Dataproc Compatibility

The following table lists the prerequisites for the GCP Dataproc plug-in, each with its minimum required version.

Component                           Version
Control-M/EM                        9.0.20.200
Control-M/Agent                     9.0.20.201
Control-M Application Integrator    9.0.20.201
Control-M Web                       9.0.20.200
Control-M Automation API            9.0.20.250
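
The minimum versions above can be checked with a small comparison helper. The following is a minimal sketch, not a BMC utility: the function name version_at_least is hypothetical, and it assumes a sort command that supports the -V (version sort) option, as GNU coreutils does. How you obtain the installed version of each component is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the installed version is equal to
# or higher than the minimum version. Assumes "sort -V" is available.
version_at_least() {
    installed=$1
    minimum=$2
    # Version-sort both values; the lower one comes out first.
    lowest=$(printf '%s\n%s\n' "$installed" "$minimum" | sort -V | head -n 1)
    # The requirement is met when the minimum is the lower (or equal) value.
    [ "$lowest" = "$minimum" ]
}

# Example: compare an Agent version that you supply against the table.
# version_at_least "9.0.20.201" "9.0.20.201" && echo "Agent version is sufficient"
```

Version sort is used because a plain string comparison would misorder values such as 9.0.20.99 and 9.0.20.250.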

Control-M for GCP Dataproc is supported on Control-M Web and Control-M Automation API, but not on Control-M client.

To download the required installation files for each prerequisite, see Obtaining Control-M Installation Files.

Setting up Control-M for GCP Dataproc

This procedure describes how to deploy the GCP Dataproc plug-in, create a connection profile, and define a GCP Dataproc job in Control-M Web and Automation API.

Note

Integration plug-ins released by BMC require an Application Integrator installation at your site. However, these plug-ins are not editable and you cannot import them into Application Integrator. To deploy these integrations to your Control-M environment, you import them directly into Control-M using Control-M Automation API.

Before You Begin

Verify that Automation API is installed, as described in Automation API Installation.

Begin

  1. Create a temporary directory to save the downloaded files.
  2. Download the GCP Dataproc plug-in from the Control-M for GCP Dataproc download page in the EPD site.
  3. Install the GCP Dataproc plug-in via one of the following methods:
    • Versions 9.0.21 or Higher: Use the Provision service of Automation API, as follows:
      1. As an administrator on the Control-M/EM Server, store the downloaded zip file in one of the following locations:
        • Linux: $HOME/ctm_em/AUTO_DEPLOY
        • Windows: <EM_HOME>\AUTO_DEPLOY
        Within several minutes, the zip file is available in all Control-M interfaces associated with the Control-M/EM.
      2. As an application user on the Agent machine, run the provision image command, as follows:
        • Linux: ctm provision image GDP_plugin.Linux
        • Windows: ctm provision image GDP_plugin.Windows
    • Versions Lower than 9.0.21: Use the Deploy service of Automation API, as described in deploy jobtype.
  4. Create a GCP Dataproc connection profile in Control-M Web or Automation API.
  5. Define a GCP Dataproc job in Control-M Web or Automation API.
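
The Linux variant of step 3 for versions 9.0.21 or higher can be sketched as two shell functions, one for each machine involved. This is an illustrative sketch only: the function names and the zip file argument are hypothetical, while the AUTO_DEPLOY path and the ctm provision image command are taken from the procedure above. The second argument of stage_plugin_zip is an optional override added for this sketch.

```shell
#!/bin/sh
# Run on the Control-M/EM Server as an administrator.
# Copies the downloaded plug-in zip file into the AUTO_DEPLOY directory.
# $1: path to the downloaded zip file (name is not assumed here)
# $2: optional AUTO_DEPLOY directory, defaults to the Linux location
stage_plugin_zip() {
    zip_file=$1
    deploy_dir=${2:-"$HOME/ctm_em/AUTO_DEPLOY"}
    if [ ! -f "$zip_file" ]; then
        echo "zip file not found: $zip_file" >&2
        return 1
    fi
    if [ ! -d "$deploy_dir" ]; then
        echo "AUTO_DEPLOY directory not found: $deploy_dir" >&2
        return 1
    fi
    cp "$zip_file" "$deploy_dir/"
}

# Run on the Agent machine as an application user, after the zip file
# has become available (this can take several minutes).
provision_plugin() {
    if ! command -v ctm >/dev/null 2>&1; then
        echo "ctm CLI not found; verify that Automation API is installed" >&2
        return 1
    fi
    ctm provision image GDP_plugin.Linux
}

# Example usage (the zip path is a placeholder):
# stage_plugin_zip "/tmp/dataproc_download/plugin.zip"
# provision_plugin
```

The two functions are kept separate because the staging step runs on the Control-M/EM Server and the provisioning step runs on the Agent machine.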

Note

To remove this plug-in from an Agent, follow the instructions in Removing a Plug-in. The plug-in ID is GDP042022.
