Control-M for Databricks


Databricks is a cloud-based data analytics platform that enables you to process and analyze large volumes of data.

Control-M for Databricks enables you to do the following:

  • Execute Databricks jobs.
  • Manage Databricks credentials in a secure connection profile.
  • Connect to any Databricks endpoint.
  • Integrate Databricks jobs with other Control-M jobs into a single scheduling environment.
  • Monitor the status, results, and output of Databricks jobs in the Monitoring domain.
  • Attach an SLA job to your Databricks jobs.
  • Use all Control-M capabilities with Control-M for Databricks, including advanced scheduling criteria, complex dependencies, Resource Pools, Lock Resources, and variables.
  • Run up to 50 Databricks jobs simultaneously per Agent.

Setting up Control-M for Databricks

This procedure describes how to install the Databricks plug-in, create a connection profile, and define a Databricks job in Helix Control-M and Automation API.

Before You Begin

  • Verify that Automation API is installed, as described in Setting up the API.
  • Verify that Agent version 9.0.21.080 or later is installed.

Begin

  1. On the Agent host, set the Java environment variable by running one of the following commands from the command line:
    • Linux:
      • Bourne shell/bash: export BMC_INST_JAVA_HOME=<java_11_directory>
      • csh/tcsh: setenv BMC_INST_JAVA_HOME <java_11_directory>
    • Windows: set BMC_INST_JAVA_HOME="<java_11_directory>"
  2. Run one of the following API commands:
    • For a fresh installation, use the provision image command:
      • Linux: ctm provision image DBX_plugin.Linux
      • Windows: ctm provision image DBX_plugin.Windows
    • For an upgrade, use the following command:
      ctm provision agent::update
  3. Create a Databricks connection profile in Helix Control-M or Automation API.
  4. Define a Databricks job in Helix Control-M or Automation API.
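
Steps 1 and 2 can be sketched as a short bash script for a Linux Agent host. This is only an illustration under stated assumptions: the path /opt/java/jdk-11 is a placeholder for your Java 11 directory, and the provision_command helper with its "fresh" and "upgrade" labels is hypothetical, introduced here for readability. The script prints the command from step 2 rather than running it, so you can review it first.

```shell
#!/usr/bin/env bash
# Sketch of steps 1 and 2 on a Linux Agent host (bash).
# ASSUMPTION: /opt/java/jdk-11 is a placeholder, not a path from this document.
export BMC_INST_JAVA_HOME="${BMC_INST_JAVA_HOME:-/opt/java/jdk-11}"

# Hypothetical helper: build the step 2 command for a fresh installation
# or an upgrade. The mode labels "fresh" and "upgrade" are invented here.
provision_command() {
  local mode="$1" platform="${2:-Linux}"
  case "$platform" in
    Linux|Windows) ;;
    *) echo "unsupported platform: $platform" >&2; return 1 ;;
  esac
  case "$mode" in
    fresh)   echo "ctm provision image DBX_plugin.${platform}" ;;
    upgrade) echo "ctm provision agent::update" ;;
    *)       echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Print the command for review; run it by hand once it looks right.
provision_command fresh Linux
```

Printing instead of executing keeps the sketch safe to try on a host where the Automation API CLI is not yet configured. For an upgrade, call provision_command upgrade instead.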

Note

To remove this plug-in from an Agent, see Removing a Plug-in. The plug-in ID is DBX032022.

Change Log

The following table provides details about changes that were introduced in new versions of this plug-in:

Plug-in Version    Details
---------------    -------
1.0.00             Initial release.
1.0.01             Multiple task enhancement.
1.0.02             Idempotency enhancement.
1.0.03             New job icon.
1.0.04             Removal of the Job Name attribute.
1.0.05             Semantic changes.
1.0.06             Failure Tolerance job parameter added.