Control-M for GCP Dataplex

GCP Dataplex is an extract, transform, and load (ETL) service that enables you to visualize and manage data in GCP BigQuery and the cloud.

Control-M for GCP Dataplex enables you to do the following:

  • Execute any of the following job actions:
    • Data Quality Task: Executes a predefined data quality task in GCP BigQuery or Google Cloud Storage locations, and defines data controls in BigQuery environments.
    • Custom Spark Task: Executes a predefined, scheduled Apache Spark task to analyze and process your data.
    • Data Profiling Scan: Executes a predefined data scan to identify shared statistical characteristics between BigQuery tables.
    • Data Quality Scan: Executes a predefined data quality scan that validates your data and logs alerts when the data fails validation.
  • Manage GCP Dataplex credentials in a secure connection profile.
  • Connect to any GCP Dataplex endpoint.
  • Integrate GCP Dataplex jobs with other Control-M jobs into a single scheduling environment.
  • Monitor the status, results, and output of GCP Dataplex jobs in the Monitoring domain.
  • Attach an SLA job to your GCP Dataplex jobs.
  • Introduce all Control-M capabilities to Control-M for GCP Dataplex, including advanced scheduling criteria, complex dependencies, Resource Pools, Lock Resources, and variables.
  • Run up to 50 GCP Dataplex jobs simultaneously per Agent.

Setting Up Control-M for GCP Dataplex

This procedure describes how to install the GCP Dataplex plug-in, create a connection profile, and define a GCP Dataplex job in Helix Control-M and Automation API.

Before You Begin

  • Verify that Automation API is installed, as described in Setting up the API.
  • Verify that Agent version 9.0.21.080 or later is installed.
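
The Agent version prerequisite can be checked with a small script. The following is a minimal sketch, not a BMC-provided tool: the function name version_at_least is hypothetical, the installed version string is assumed to be supplied by you (how you obtain it is not covered here), and the comparison assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Minimum Agent version, taken from the prerequisite above.
REQUIRED_AGENT_VERSION="9.0.21.080"

# Hypothetical helper: succeeds if the installed version is equal to or
# later than the required version.
# Usage: version_at_least <installed_version> [required_version]
version_at_least() {
  installed="$1"
  required="${2:-$REQUIRED_AGENT_VERSION}"
  # Version-sort both strings; if the required version sorts first (or the
  # two are identical), the installed version is new enough.
  lowest="$(printf '%s\n%s\n' "$installed" "$required" | sort -V | head -n 1)"
  [ "$lowest" = "$required" ]
}
```

For example, version_at_least 9.0.21.100 succeeds, while version_at_least 9.0.20.200 fails.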

Begin

  1. On the Agent host, set the BMC_INST_JAVA_HOME environment variable to your Java 11 directory by running one of the following commands from the command line:
    • Linux:
      • Bourne Shell/Bash: export BMC_INST_JAVA_HOME=<java_11_directory>
      • csh/tcsh: setenv BMC_INST_JAVA_HOME <java_11_directory>
    • Windows: set BMC_INST_JAVA_HOME="<java_11_directory>"
  2. Run one of the following API commands:
    • For a fresh installation, run the following provision image command:
      • Linux: ctm provision image GCP_Dataplex_plugin.Linux
      • Windows: ctm provision image GCP_Dataplex_plugin.Windows
    • To upgrade, run the following command:
      ctm provision agent::update
  3. Create a GCP Dataplex connection profile in Helix Control-M or Automation API.
  4. Define a GCP Dataplex job in Helix Control-M or Automation API.
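
Steps 1 and 2 of a fresh installation can be combined in a short script. The following is a minimal sketch under stated assumptions: the function names plugin_image_for_os and install_plugin and the DRY_RUN switch are hypothetical, the script assumes a POSIX shell and that the ctm command is on the PATH, and only the variable name, image names, and ctm provision image command come from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: map an OS name to the provision image named in step 2.
plugin_image_for_os() {
  case "$1" in
    Linux)   echo "GCP_Dataplex_plugin.Linux" ;;
    Windows) echo "GCP_Dataplex_plugin.Windows" ;;
    *)       echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: set the Java variable (step 1), then provision the
# plug-in image (step 2, fresh installation).
# Usage: install_plugin <java_11_directory> [os_name]
install_plugin() {
  java_dir="$1"
  os_name="${2:-$(uname -s)}"
  if [ ! -d "$java_dir" ]; then
    echo "Java directory not found: $java_dir" >&2
    return 1
  fi
  export BMC_INST_JAVA_HOME="$java_dir"
  image="$(plugin_image_for_os "$os_name")" || return 1
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it (assumed convenience switch).
    echo "ctm provision image $image"
  else
    ctm provision image "$image"
  fi
}
```

Running the function with DRY_RUN=1 prints the provision command without executing it, which is a cautious way to confirm the image name before changing the Agent. The upgrade path (ctm provision agent::update) is not covered by this sketch.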

Note

To remove this plug-in from an Agent, see Removing a Plug-in. The plug-in ID is GDQ112023.
