Steps in Atrium Integrator transformation

You can add intermediate steps to a transformation in a job that you create through the Integration Job Builder wizard, or to a custom job you create through Atrium Integrator Spoon.

The following table lists the steps that you can add to your transformation to manipulate the data before adding it to BMC Helix CMDB.

BMC supports only the steps listed in the table. For information about additional steps that you can add to your transformation, see the Pentaho online documentation.

Any queries around complex customization using unsupported steps will be addressed selectively and at discretion of BMC. See the topic Atrium Integrator Spoon support in Atrium-Integrator-Spoon.

To access the Atrium Integrator Spoon

Log in to the computer on which Atrium Integrator Spoon is installed.
To open Atrium Integrator Spoon, open the Spoon.bat file at the following location:
<installationDirectory>\diserver\data-integration\

To add a transformation or a job

Select Job or Transformation.
Click the step that you want to add and drag it to the canvas.

The following table lists the intermediate steps for transformations and the descriptions:

Step

What it helps you to do

Input

ARInput

ARInput is a BMC provided step. It allows you to migrate AR System data into BMC Helix CMDB.

ARInput step fields and description (click to expand)

Field and description

Field	Description
General tab
Connection	Specify the AR System server name.
Form Name	Specify the form name from where you want fetch the AR System data.
Chunk Size	Specify the number of records you want to pull in one thread.
Qualification tab
Qualification String	Specify the qualification value.
Fields tab
Field list	Displays the AR System field details. Click Get Fields to populate all the available AR System fields.

CMDBInput

CMDBInput is a BMC provided step. It allows you to migrate data from the BMC Helix CMDB on your development server to the BMC Helix CMDB on your production server.

To understand how the CMDBInput step is used see, Migrating-data-from-one-BMC-Helix-CMDB-server-to-another.

CMDBInput step fields and description (click to expand)

Field and description

Field	Description
Connection	Specify the AR System server name.
Class Name	Specify the CMDB form name from where you want to read the data.
NameSpace	Specify the BMC namespace. For example BMC.Core
Dataset Id	Specify the BMC dataset ID. For example, BMC.ASSET
Attribute List	Displays the list of attributes that belong to the selected class. Click Get Attributes to populate the list.
To Filter Criteria Select Values of Attributes	Define the filter criteria to compare CMDB fields to qualifications value.
Selected Qualifications	Displays the list of qualification strings. For example, in AND 'MarkAsDeleted' = NULL string, AND is the join condition, MarkAsDeleted is the field value and NULL is the qualification value .

ARX File Input

ARX File Input is a BMC provided step. It allows you to migrate data from ARX files into BMC Helix CMDB.

ARX File Input step fields and description (click to expand)

Field and description

Field	Description
File Name	Specify fieldname that will contain the filename(s) to read from.
Chunk Size	Specify number of records you want to pull in one thread.
Schema Names	AR form name that the arx file uses.
Connection	Specify the AR server name.
Fields table	List of arx data fields. The data is pushed in the displayed fields only.
Preview Rows button	Click to view the data in the fields.

Table Input

Table input step allows you to read information from a database, using a connection and SQL. Basic SQL statements are generated automatically.

BMC supports selective vendor databases which include – IBM DB2, MS SQL Server, Oracle, Sybase, and My SQL.

To understand how Table input step is used, see Troubleshooting-issues-with-Atrium-Integrator-migrated-jobs.

For information about adding qualifications to the Table Input step, see the following video:

https://youtu.be/DMpTugESxjM.

This video is recorded using the earlier version of BMC Atrium Core and is valid for BMC Helix CMDB 9.1 and above.

Table Input step fields and description (click to expand)

Field and description

Field	Description
Connection	The database connection from which to read data.
SQL	The SQL statement used to read information from the database connection. You can also click Get SQL select statement... to browse tables and automatically generate a basic select statement.
Enable lazy conversion	Select if you want to avoid unnecessary data type conversions. Enabling this option results in a significant performance improvements.
Replace variables in script?	Select if you want to replace variables in the script. This option allows you to test with or without performing variable substitutions.
Insert data from step	Input step name where Pentaho can expect information to come from. This information can then be inserted into the SQL statement. The locators where Pentaho inserts information is indicated by ? (question marks).
Execute for each row?	Select to perform the data insert for each individual row.
Limit size	Sets the number of lines that is read from the database. Zero (0) means read all lines.

CSV File Input

CSV File input step allows you to migrate data from CSV files into BMC Helix CMDB.

CSV File Input step fields and description (click to expand)

Field and description

Field	Description
Filename	Select the fieldname that will contain the filename(s) to read from.
Delimiter	Specify the file delimiter character used in the target file. Special characters (e.g. CHAR HEX01) can be set with the format $[value], e.g. $[01] or $[6F,FF,00,1F].
Enclosure	Specify the enclosure character used in the target file. Special characters (e.g. CHAR HEX01) can be set with the format $[value], e.g. $[01] or $[6F,FF,00,1F].
NIO buffer size	This is the size of the read buffer. It represents the amount of bytes that is read in one time from disk.
Lazy conversion	The lazy conversion algorithm tries to avoid unnecessary data type conversions and can result in a significant performance improvements if this is possible. For example, reading from a text file and writing back to a text file.
Header row present?	Select this option if the target file contains a header row containing column names.
Add filename to result	Adds the CSV filename(s) to the result of this transformation. A unique list is being kept in memory that can be used in the next job entry in a job, for example in another transformation.
The row number field name (optional)	The name of the Integer field that will contain the row number in the output of this step.
Running in parallel?	Select if you have multiple instances of this step running (step copies) and if you want each instance to read a separate part of the CSV file(s). When reading multiple files, the total size of all files is taken into consideration to split the workload. And make sure that ALL step copies receive all files that need to be read, otherwise, the parallel algorithm will not work correctly. WARNING: For technical reasons, parallel reading of CSV files is only supported on files that don't have fields with line breaks or carriage returns in them.
File encoding	Specify the encoding of the file being read.
Fields Table	Displays an ordered list of fields to be read from the target file.
Preview button	Click to preview the data coming from the target file.
Get Fields button	Click to return a list of fields from the target file based on the current settings (i.e. Delimiter, Enclosure, etc.). All fields identified will be added to the Fields Table.

Text File Input

The Text File Input step allows you to migrate data from a variety of different text-file types.The Text File Input step provides you with the ability to specify a list of files to read, or a list of directories with wild cards in the form of regular expressions. In addition, you can accept filenames from a previous step making filename handling more even more generic.

Text file input step fields and description (click to expand)

Field and description

Field	Description
File tab
File or directory	This field specifies the location and/or name of the input text file. Click Add to add the file/directory/wildcard combination to the list of selected files (grid).
Regular Expression	Specify the regular expression you want to use to select the files in the directory specified in the previous option. For example, you want to process all files that have a .txt extension.
Exclude Regular Expression	Specify the regular expressions you want to exclude.
Selected files table	Displays a list of selected files (or wildcard selections) along with a property specifying if file is required or not. If a file is required and it isn't found, an error is generated. Otherwise, the filename is skipped.
Accept filenames from previous step	This option allows even more flexibility in combination with other steps such as "Get Filenames". You can construct your filename and pass it to this step. This way the filename can come from any source: text file, database table, etc. Select this option to get filenames from previous steps.
Pass through fields from previous step	Select if you want to pass through fields from previous step.
Step to read filenames from	Specify the step from which to read the filenames.
Field in input to use as filename	Text File Input looks in this step to determine which filenames to use.
Show filenames(s)...	Displays a list of all files that will be loaded based on the current selected file definitions.
Show file content	Displays the content of the selected file.
Show content from first data line	Displays the content from the first data line only for the selected file.
Content tab	The content tab enables you to specify the format of the text files that are being read.
File type	Can be either CSV or Fixed length. Based on this selection, Atrium Integrator Spoon launches a different helper GUI when you click the Get fields button in the Fields tab.
Separator	Specify one or more characters that separate the fields in a single line of text. Typically this is ; or a tab. Special characters (e.g. CHAR ASCII HEX01) can be set with the format $[value], for example, $[01] or $[6F,FF,00,1F].
Enclosure	Some fields can be enclosed by a pair of strings to allow separator characters in fields. The enclosure string is optional. If you use repeat an enclosures allow text line 'Not the nine o''clock news.'. With ' the enclosure string, this gets parsed as Not the nine o'clock news. Special characters (e.g. CHAR ASCII HEX01) can be set with the format $[value], for example, $[01] or $[6F,FF,00,1F].
Escape	Specify an escape character (or characters) if you have these types of characters in your data. If you have \ as an escape character, the text 'Not the nine o\'clock news' (with ' the enclosure) gets parsed as Not the nine o'clock news. Special characters (e.g. CHAR HEX01) can be set with the format $[value], for example, $[01] or $[6F,FF,00,1F].
Header	Select if your text file has a header row (first lines in the file). You can specify the number of times the header lines appears.
Footer	Select if your text file has a footer row (last lines in the file). You can specify the number of times the footer row appears.
Wrapped lines	Use if you deal with data lines that have wrapped beyond a specific page limit. Note: Headers and footers are never considered wrapped.
Page layout	Use these options as a last resort when dealing with texts meant for printing on a line printer; use the number of document header lines to skip introductory texts and the number of lines per page to position the data lines.
Compression	Select if your text file is placed in a Zip or GZip archive. Note: Currently, only the first file in the archive is read.
No empty rows	Select if you do not send empty rows to the next steps.
Include filename in output	Select if you want the filename to be part of the output.
Rownum in output	Select if you want the row number to be part of the output
Format	Can be either DOS, UNIX or mixed. UNIX files have lines that are terminated by line feeds. DOS files have lines separated by carriage returns and line feeds. If you specify mixed, no verification is done.
Encoding	Specify the text file encoding to use; leave blank to use the default encoding on your system. To use Unicode, specify UTF-8 or UTF-16. On first use, Spoon searches your system for available encodings.
Limit	Sets the number of lines that is read from the file; zero (0) means read all lines.
Be lenient when parsing dates	Deselect, if you want strict parsing of data fields. If case-lenient parsing is selected, dates like Jan 32nd will become Feb 1st.
The date format locale	This locale is used to parse dates that have been written in full such as "February 2nd, 2006;" parsing this date on a system running in the French (fr_FR) locale would not work because February is called Février in that locale.
Add filenames to result	Select to add output file names as a field in the results.
Error Handling tab	The error handling tab allows you to specify how the step reacts when errors occur.
Ignore errors	Select if you want to ignore errors during parsing.
Skip error lines	Select if you want to skip those lines that contain errors. You can generate an extra file that contains the line numbers on which the errors occurred. Lines with errors are not skipped, the fields that have parsing errors, will be empty (null).
Error count field name	Add a field to the output stream rows. This field contains the number of errors on the line.
Error fields field name	Add a field to the output stream rows. This field contains the field names on which an error occurred.
Error text field name	Add a field to the output stream rows. T This field contains the descriptions of the parsing errors that have occurred.
Warning files directory	When warnings are generated, they are placed in this directory. The name of that file is <warning dir>/filename.<date_time>.<warning extension>.
Error files directory	When errors occur, they are placed in this directory. The name of the file is <errorfile_dir>/filename.<date_time>.<errorfile_extension>.
Failing line numbers file directory	When a parsing error occurs on a line, the line number is placed in this directory. The name of that file is <errorline dir>/filename.<date_time>.<errorline extension>.
Filters tab	The filters tab provides you with the ability to specify the lines you want to skip in the text file.
Filter string	Specify the string for which to search.
Filter position	Specify the position where the filter string has to be at in the line. Zero (0) is the first position in the line. If you specify a value below zero (0), the filter string is searched for in the entire string.
Stop on filter	Specify Y if you want to stop processing the current text file when the filter string is encountered.
Positive match	Includes the rows where the filter condition is found (include). The alternative is that those rows are avoided (exclude).
Fields tab	The fields tab allows you to specify the information about the name and format of the fields being read from the text file.
Name	Name of the field.
Type	Type of the field can be either String, Date or Number.
Format	Select the format in which you want the Type of the field For example, if you select Date in the Type column, you can select a format from the Format list.
Length	For Number: Total number of significant figures in a number. For String: total length of string. For Date: length of printed output of the string (e.g. 4 only gives back the year).
Precision	For Number: Number of floating point digits. For String, Date, Boolean: unused.
Currency	Used to interpret numbers like $10,000.00 or E5.000,00
Decimal	A decimal point can be a "." (10;000.00) or "," (5.000,00)
Group	A grouping can be a dot "," (10;000.00) or "." (5.000,00)
Null If	Treat this value as NULL.
Default	Default value in case the field in the text file was not specified (empty).
Trim type	Type trim this field (left, right, both) before processing.
Repeat	If the corresponding value in this row is empty, repeat the one from the last time it was not empty.

Get Data from XML

Get Data From XML allows you to read data from any type of XML file using XPath specifications. Get Data From XML step can read data from 3 kind of sources (files, stream and url) in 2 modes (user can define files and urls at static mode or in a dynamic way).

Get Data from XML step fields and description (click to expand)

Field and description

Field	Description
File tab	The Files tab is where you define the location of the XML files from which you want to read.
XML source is defined in a field	Select if the previous step is giving XML data in a certain field in the input stream.
XML source is file name	Select if the previous step is giving filenames in a certain field in the input stream. These are read.
Read source as Url	Select if the previous step is giving URLs in a certain field in the input stream. These are read.
Get XML source from a field	Specify the field to read XML, filename or URL from.
File or directory	Specify the location and/or name of the input text file. *Note*: Click Add to add the file/directory/wildcard combination to the list of selected files (grid) below.
Regular Expression	Specify the regular expression you want to use to select the files in the directory specified in the previous option.
Exclude Regular Expression	Specify the regular expressions you want to exclude.
Selected Files	Files from where the data is pulled.
Show filename(s)	Contains a list of selected files (or wildcard selections) and a property specifying if file is required or not. If a file is required and it is not found, an error is generated;otherwise, the file name is skipped.
Content tab
Loop XPath	For every "Loop XPath" location we find in the XML file(s), we will output one row of data. This is the main specification we use to flatten the XML file(s). Click Get XPath nodes to search for the possible repeating nodes in the XML document. Please note that if the XML document is large that this can take a while.
Encoding	The XML filename encoding in case none is specified in the XML documents.
Namespace aware	Select to make the XML document namespace aware.
Ignore comments	Ignore all comments in the XML document while parsing.
Validate XML	Validate the XML prior to parsing. Use a token when you want to replace dynamically in a Xpath field value. A token is between @_ and - (@_fieldname-). Please see the Example 1 to see how it works.
Use token	A token is not related tro XML parsing but to PDI.
Ignore empty file	An empty file is not a valid XML document. Select this if you want to ignore those altogether.
Do not raise an error if no files	Select if you don't want to raise an error if no files are found.
Limit	Limits the number of rows to this number (zero (0) means all rows).
Prune path to handle large files	Almost the same value as the "Loop XPath" property with some exceptions.
Include filename in output?	Select if you want to specify a field name to include the file name (String) in the output of this step.
Rownum in output?	Select if you want to specify a field name to include the row number (Integer) in the output of this step.
Add files to result filesname	Adds the XML filenames read to the result of this transformation. A unique list is being kept in memory that can be used in the next job entry in a job, for example in another transformation.
Fields tab
Name	The name of the output field.
XPath	The path to the element node or attribute to read
Element	The element type to read: Node or Attribute
Type	The data type to convert to
Format	The format or conversion mask to use in the data type conversion
Length	The length of the output data type
Precision	The precision of the output data type
Currency	The currency symbol to use during data type conversion
Decimal	The numeric decimal symbol to use during data type conversion
Group	The numeric grouping symbol to use during data type conversion
Trim type	The type of trimming to use during data type conversion
Repeat	Repeat the column value of the previous row if the column value is empty (null)

Excel Input

Excel Input step enables you to migrate data from Excel files into BMC Helix CMDB.

Note

When you provide input as Time datatype in a Microsoft Excel Input step, the time is not updated in Spoon because Pentaho does not support the Time datatype.

Workaround: Specify time inputs as strings and change the format to HH:mm:ss, and then run the transformation.

An example is shown in the following figure:

Excel Input step fields and description (click to expand)

Field and description

Field	Description
Files tab
File or directory	Specify the file name from where you want to pull the data.
Regular Expression	Specify the regular expression you want to use to select the files in the directory specified in the previous option.
Exclude Regular Expression	Specify the regular expressions you want to exclude.
Selected Files	A list of files that will be used in this step, according to the criteria specified in the previous fields.
Accept filenames from previous step	Select to retrieve a list of filenames from the previous step in this transformation. You must also specify which step you are importing from, and the input field in that step from which you will retrieve the filename data. If you choose this option, the Show filename(s) option will show a preview of the list of filenames.
Sheets tab
List of sheets to read	A list of worksheets that you want to use. If this remains empty, all worksheets in all specified files will be selected. Rows and columns are numbered, starting with 0.
Get sheetname(s)	Click to retrieve a list of worksheets from all of the specified files and give you the option to select some or all of them for this step.
Content tab
Header	Select if there is a header row to skip in the selected worksheets.
No empty row	Select if you want to remove empty rows from the output.
Stop on empty row	Select if you want to stop reading from the current worksheet when an empty row is read.
Limit	Sets a static number of rows to read. If set to 0, there is no set limit.
Encoding	Specifies the text file encoding to use. Leave blank to use the default encoding on your system. To use Unicode, specify UTF-8 or UTF-16. On first use, Spoon searches your system for available encodings.
Spread sheet type (engine)	Specifies which spreadsheet format to expect from the file, regardless of its extension.
Add filenames to result	Select to pass the input filenames to the output.
Error Handling tab
Strict types?	Select if you want PDI to report data type errors in the input.
Ignore errors?	Select if you don't want no errors to be reported during input parsing.
Skip error lines?	Select if you want PDI to skip lines that contain errors. These lines can be dumped to a separate file by specifying a path in the Failing line numbers files directory field below. If this is not selected, lines with errors appear as NULL values in the output.
Warning files directory	Directory in which to create files that contain warning messages regarding input values for each spreadsheet file read. These files will have the extension you specify here.
Error files directory	Directory in which to create files that contain error messages regarding input values for each spreadsheet file read. These files will have the extension you specify here.
Failing line numbers files directory	Directory in which to create files that contain the lines that failed error checks during input validation. These files will have the extension you specify here.
Fields tab
Name	The name of the field.
Type	The field's data type; String, Date or Number.
Length	The length option depends on the field type. Number: total number of significant figures in a number. String: total length of a string. Date: determines how much of the date string is printed or recorded.
Precision	The precision option depends on the field type, but only Number is supported; it returns the number of floating point digits.
Trim type	Truncates the field (left, right, both) before processing. Useful for fields that have no static length.
Repeat	If set to Y, will repeat this value if the next field is empty.
Format	The format mask (number type).
Currency	Symbol used to represent currencies.
Decimal	A decimal point; this is either a dot or a comma.
Grouping	A method of separating units of thousands in numbers of four digits or larger. This is either a dot or a comma.
Additional output fields tab
Full filename field	The full file name including the file path and the file extension. For example, C:\Users\Administrator\Desktop\INC.xls
Sheetname field	The worksheet name you're using.
Sheet row nr field	The current sheet row number.
Row nr written field	Number of rows written.
Short filename field	Only the file name. For example, INC.xls
Extension field	The three- or four-letter file type extension.
Path field	Only the file path. For example, C:\Users\Administrator\Desktop

Output

AROutput

AROutput is a BMC provided step. It allows you to export data onto AR system.

AROutput step fields and description (click to expand)

Field and description

Field	Description
General tab
Step name	Name of the step; this name has to be unique in a single transformation.
Connection	Specify the the database connection to which data is written.
Form Name	Specify the AR System form name where you want to push the data.
Batch Commit Size	Specify the amount of data you want to commit in one batch.
Fallback to single row commit if batch commit fails	Select if you want to proceed with single row commit if batch commit fails.
Duplicates tab
Duplicate Request Action	Select the data duplication conditions.
Match By Request ID	Select if you want to match the data and update by request ID.
Data Handling tab
Require Required Fields	Is selected by default if you select Match by Request ID.
Enforce Pattern Matching	Is selected by default if you select Match by Request ID.
Field Mapping tab
Merge Fields	Use to map the form field to stream field.

CMDBOutput

CMDBOutput is a BMC provided step. It allows you to load data into BMC Helix CMDB.

CMDBOutput step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Connection	The database connection to which data is written.
Class Name	Specify the CMDB form name from where you want to push the data.
NameSpace	Specify the BMC namespace. For example BMC.Core
Dataset ID	Specify the BMC dataset ID. For example, BMC.ASSET
CMI Commit Size	Not applicable.
Qualification String	Not applicable. Use The key(s) to look up the value(s) section instead.
Only insert new CIs	Select if you want to update only new CIs.
Always insert CIs	Select if you want to transfer all source records to BMC Helix CMDB without checking for duplicates. See, Inserting-or-updating-bulk-records-in-BMC-Helix-CMDB.
Use Cache	Select if you want to insert or update bulk records in BMC Helix CMDB . See, Increasing-cache-size-for-Atrium-Integrator.
Use CheckSum	Select if you want to use the CheckSum feature to update records in BMC Helix CMDB . See, Configuring-the-checksum-value-for-loading-data-into-BMC-Helix-CMDB.
The key(s) to lookup the value(s)	Define the qualification to find the matching data.
Update fields	Define if the input data should update the existing data.
Edit Mapping	Define the mapping between source and target fields.

Table Output

The Table output step allows you to load data into a database table.

BMC supports selective vendor databases which include – IBM DB2, MS SQL Server, Oracle, Sybase, and My SQL.

Table Output step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Connection	Specify the database connection to which data is written.
Target Schema	Specify the name of the Schema for the table to write data to. This is important for data sources that allow for table names with periods in them.
Target table	Specify the name of the table to which data is written.
Commit size	Use transactions to insert rows in the database table. Commit the connection every N rows if N is larger than zero (0); otherwise, don't use transactions. (Slower). Note: Transactions are not supported on all database platforms.
Truncate table	Select if you want the table to be truncated before the first row is inserted into the table. Note: Don't use this option when you are running the transformation clustered.
Main options tab
Partition data over tables	Use to split the data over multiple tables.
Use batch updates for inserts	Select if you want to use batch inserts. This feature groups inserts statements to limit round trips to the database. This is the fastest option and is enabled by default. This feature is not available under these conditions: If the transformation database is transactional, if you are using Greenplum or PostgreSQL with error handling turned on, if you are using auto-generated keys.
Is the name of the table defined in the field?	Use these options to split the data over one or more tables; the name of the target table is defined in the field you specify. For example if you store customer data in the field gender, the data might end up in tables M and F (Male and Female). There is an option to exclude the field containing the tablename from being inserted into the tables.
Return auto-generated key	Select if you want to get back the key that was generated by inserting a row into the table.
Name of auto-generated key field	Specifies the name of the new field in the output rows that contains the auto-generated key.
SQL	Generates the SQL to create the output table automatically.
Database fields tab
Fields to insert	Enter the Source (Stream) and Table (Target or Existing data) Field name.
Enter field mapping	Define the mapping between source and target fields.

Text File Output

The Text file output step is used to export data to text file format. This is commonly used to generate Comma Separated Values (CSV files) that can be read by spreadsheet applications. It is also possible to generate fixed width files by setting lengths on the fields in the fields tab.

Text File Output step fields and description (click to expand)

Field and description

Field	Description
File tab	The File tab is where you define basic properties about the file being created.
Step name	Name of the step; this name has to be unique in a single transformation.
Filename	Specify the filename and location of the output text file.
Run this as a command instead?	Select if you want to "pipe" the results into the command or script you specify.
Do not create file at start?	Select if you don't want to create the file until the end of the step.
Extension	Adds a point and the extension to the end of the filename. (.txt)
Accept file name from field?	Select if you want to specify the filename(s) in a field in the input stream.
File name field	When the previous option is enabled, you can specify the field that will contain the filename(s) at runtime.
Include stepnr in filename	If you run the step in multiple copies (Launching several copies of a step), the copy number is included in the filename, before the extension.
Include partition nr in filename?	Select if you want to includes the data partition number in the filename.
Include date in filename	Select if you want to includes the system date in the filename.
Include time in filename	Select if you want to includes the system time in the filename.
Specify date format	Select if you want the filename to include a date and time stamp that follows the selection you choose from the drop-down box. Selecting this option disables the previous two options.
Show filename(s)	This option shows a list of the files that will be generated. Note: This is a simulation and among others depends on the number of rows that will go into each file.
Add filenames to result	Select if you want to use the Filename field in constructing the result filename. If deselected, the Filename field is ignored.
Content tab
Append	Select if you want to append lines to the end of the specified file.
Separator	Specify the character that separates the fields in a single line of text. Typically this is ; or a tab.
Enclosure	A pair of strings can enclose some fields. This allows separator characters in fields. The enclosure string is optional.
Force the enclosure around fields?	Select if you want to force all field names to be enclosed with the character specified in the Enclosure property above.
Disable the enclosure fix?	This is for backward compatibility reasons (since version 4.1) related to enclosures and separators. The logic since version 4.1 is: When a string field contains an enclosure it gets enclosed and the enclose itself gets escaped. When a string field contains a separator, it gets enclosed. Check this option, if this logic is not wanted. It has also an extra performance burden since the strings are scanned for enclosures and separators. So when you are sure there is no such logic needed since your strings don't have these characters in there and you want to improve performance, un-check this option.
Header	Select if you want the text file to have a header row. (First line in the file).
Footer	Select if you want the text file to have a footer row. (Last line in the file).
Format	This can be either DOS or UNIX. UNIX files have lines are separated by linefeeds. DOS files have lines separated by carriage returns and line feeds.
Encoding	Specify the text file encoding to use. Leave blank to use the default encoding on your system. To use Unicode specify UTF-8 or UTF-16. On first use, Spoon will search your system for available encodings.
Compression	Allows you to specify the type of compression, .zip or .gzip to use when compressing the output. Note: Currently, only one file is placed in a single archive.
Right pad fields	Select if you want to add spaces to the end of the fields (or remove characters at the end) until they have the specified length.
Fast data dump (no formatting	Select if you want to improve the performance when dumping large amounts of data to a text file by not including any formatting information.
Split every ... rows	If this number N is larger than zero, split the resulting text-file into multiple parts of N rows.
Add Ending line of file	Select if you want to specify an alternate ending row to the output file.
Fields tab	The fields tab is where you define properties for the fields being exported.
Name	The name of the field.
Type	Type of the field can be either String, Date or Number.
Format	The format mask to convert with. See Number Formats for a complete description of format symbols.
Length	The length option depends on the field type follows: Number - Total number of significant figures in a number String - total length of string Date - length of printed output of the string (e.g. 4 only gives back year)
Precision	The precision option depends on the field type as follows: Number - Number of floating point digits String - unused Date - unuse
Currency	Symbol used to represent currencies like $10,000.00 or E5.000,00
Decimal	A decimal point can be a "." (10,000.00) or "," (5.000,00)
Group	A grouping can be a "," (10,000.00) or "." (5.000,00)
Trim type	The trimming method to apply on the string. Trimming only works when there is no field length given.
Null	If the value of the field is null, insert this string into the textfile.
Get	Click to retrieve the list of fields from the input fields stream(s)
Minimal width	Alter the options in the fields tab in such a way that the resulting width of lines in the text file is minimal. So instead of save 0000001, we write 1, etc. String fields will no longer be padded to their specified length.

Insert\Update

The Insert/Update step first looks up a row in a table using one or more lookup keys. If the row can't be found, it inserts the row. If it can be found and the fields to update are the same, nothing is done. If they are not all the same, the row in the table is updated.

Note: If you have multiple rows with the same keys that match, only the first row found is compared. This may lead to different results, depending on if the found row matches with given values or not. The update scenario looks like this: If a difference is found in the case of multiple rows with the same key, an UPDATE statement is fired against the database that updates all rows with the matching keys. This note also applies to the Update step.

Insert\Update step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Connection	Specify the database connection to which data is written.
Target schema	Specify the name of the Schema for the table to which data is written. This is important for data sources that allow for table names with periods in them.
Target table	Name of the table in which you want to do the insert or update.
Commit size	The number of rows to change (insert / update) before running a commit.
Don't perform any updates	Select if you want the values in the database to be never updated and only inserts are performed.
Key Lookup table	Allows you to specify a list of field values and comparators. You can use the following comparators: =, <>, <, <=, >, >=, LIKE, BETWEEN, IS NULL, IS NOT NULL Note: Click Get fields to retrieve a list of fields from the input stream(s).
Update Fields	Allows you to specify all fields in the table you want to insert/update including the keys. Avoid updates on certain fields by specifying N in the update column. Note: Click Get Update fields to retrieve a list of update fields from the input stream(s).
SQL button	Click SQL to generate the SQL to create the table and indexes for correct operation.

XML Output

This XML Output step allows you to write rows from any source to one or more XML files.

XML Output step fields and description (click to expand)

Field and description

Field	Description
File tab	The file tab is where you set general properties for the XML output file format
Step name	Name of the step; this name has to be unique in a single transformation.
Filename	Specify the file name and location of the output text file
Do not create file at start	Select if you don't want to create the file until the end of the step.
Extension	Adds a period and the extension to the end of the file name (.xml)
Include stepnr in filename	If you run the step in multiple copies (see also Launching Several Copies of a step), the copy number is included in the file name, before the extension
Include date in filename	Select if you want to include the system date in the file name
Include time in filename	Select if you want to include the system date in the file name
Specify Date time format	Select if you want the filename to include a date and time stamp that follows the selection you choose from the drop-down box. Selecting this option disables the previous two options.
Add filenames to result	Select if you want to use the Filename field in constructing the result filename. If deselected, the Filename field is ignored.
Content tab
Zipped	Select if you want the XML file to be stored in a ZIP archive.
Encoding	The encoding to use. This encoding is specified in the header of the XML file.
Parent XML element	The name of the root element in the XML document.
Row XML element	The name of the row element to use in the XML document.
Split every ... rows.	The maximum number of rows of data to put in a single XML file before another is created.
Omit null values from XML Output	Select this option to exclude elements containing null values from the output XML file. Clear this option to include elements containing null values in the output XML file. You can specify another value to replace null with the Null field in the Fields tab.
Fields tab
Fieldname	The name of the field.
Elementname	The name of the element in the XML file to use. Type: Type of the field can be either String, Date, or Number.
Type	Type of the field can be either String, Date, or Number. Format mask from which to convert.
Length	The length option depends on the field type follows: Number - Total number of significant figures in a number String - total length of string Date - length of printed output of the string (e.g. 4 only gives back year) Note: the output string is padded to this length if it is specified.
Precision	The precision option depends on the field type as follows: Number - Number of floating point digits String - unused Date - unused
Currency	Symbol used to represent currencies such as $10,000.00 or E5.000,00.
Decimal	A decimal point can be a "." (10,000.00) or "," (5.000,00).
Group	A grouping can be a "," (10,000.00) or "." (5.000,00).
Null	If the value of the field is null, insert this string into the textfile.
Get fields	Click to retrieve the list of fields from the input stream(s).
Minimal width	Alter the options in the fields tab in such a way that the resulting width of lines in the text file is minimal; for example instead of save 0000001, "1" is written, and so on. String fields are not padded to their specified length.

Excel Output

The Excel Output step allows you to write data to one or more Excel files.

Excel Output step fields and description (click to expand)

Field and description

Field	Description
File tab
Filename	The name of the spreadsheet file you are reading from.
Do not create file at start	Select if you don't want to create the file until the end of the step.
Extension	The three-letter file extension to append to the file name.
Include stepnr in filename	If you run the step in multiple copies (launching several copies of a step), the copy number is included in the file name, before the extension.
Include date in filename	Select if you want to include the system date in the filename
Include time in filename	Select if you want to include the system time (24-hour format) in the filename
Specify Date time format	Select if you want the filename will to include a date and time stamp that follows the selection you choose from the drop-down box. Selecting this option disables the previous two options.
Show file name(s)	Displays a list of the files that will be generated. This is a simulation and depends on the number of rows that will go into each file.
Add filenames to result	Select if you want to use the Filename field in constructing the result filename. If deselected, the Filename field is ignored.
Content tab
Append	Select if you want to append lines to the end of the specified file. If the file does not exist, a new one will be created.
Header	Select if you want a header to appear before the spreadsheet grid data.
Footer	Select if you want a footer to appear after the spreadsheet grid data.
Encoding	Specifies the text file encoding to use. Leave blank to use the default encoding on your system. To use Unicode, specify UTF-8 or UTF-16. On first use, Spoon searches your system for available encodings and populates this list accordingly.
Split every ... rows	Specify after how many rows, start a new spreadsheet file to continue data output.
Sheet name	Specifies the name of the worksheet within the spreadsheet file.
Product sheet?	Select if you want to enable password protection on the worksheet. You must also specify a password in the Password field.
Auto size columns	Select if you want to, automatically size the worksheet columns to the largest value.
Retain NULL values	Select if you want to preserve NULL values in the output. If deselected, NULLs are replaced with empty strings.
Use Template	Select if you want PDI to use the specified Excel template to create the output file. The template must be specified in the Excel template field.
Append to Excel	Select if you want append output to the specified Excel template.
Fields tab
Name	The name of the field.
Type	The field's data type; String, Date or Number.
Format	The format mask (number type).

Update

Update is similar to the Insert/Update step except that updates only are performed. No inserts are performed.

Update step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Connection	Specify the database connection to which data is written
Target schema	Specify the name of the Schema for the table to which data is written. This is important for data sources that allow for table names with periods in them.
Target table	Name of the table in which you want to do the insert or update.
Commit size	Specify the number of rows to change (insert / update) before running a commit.
Skip lookup	Select this option to update the target database without comparing values between target and source datasets.
Ignore lookup failure	Select this option to ignore failures in the step and process the data from the record stream (source dataset).
Key Lookup table	Allows you to specify a list of field values and comparators. You can use the following comparators: =, <>, <, <=, >, >=, LIKE, BETWEEN, IS NULL, IS NOT NULL Note: Click Get fields to retrieve a list of fields from the input stream(s).
Update Fields	Allows you to specify all fields in the table you want to update including the keys. Avoid updates on certain fields by specifying N in the update column. Note: Click Get Update fields to retrieve a list of update fields from the input stream(s).
SQL button	Click SQL to generate the SQL to create the table and indexes for correct operation.

AR Upsert

AR Upsert is a BMC provided step.The Checksum mechanism in the AR Upsert step compares the Checksum value of the source record and the target record to determine whether or not the source record needs to be imported.

ARUpsert step fields and description (click here to expand)

Field and description

Field	Description
General tab
Connection	Click the New button to create a new AR System server connection. Click the Edit button to edit an existing connection. Select an existing connection from the drop down menu.Click Wizard for a step-by-step guided procedure to create a new connection.
Form Name	Name of the AR System form where you output the data. This also works as a staging form when you provide the Alternate Target Form Name.
Use Checksum Checkbox	Performs checksum comparison in update mode.
Alternate Target Form Name	Specify when the target form is different from the staging form. When you provide the Alternate Target Form Name in an AR Upsert step, the operations are performed on the staging form. When you update any record, an entry is inserted into the staging form.
Batch Commit Size	(Optional) When you specify a Batch Commit Size, the bulk API import the data. You can specify the Batch Commit Size as variable. When you specify the Batch Commit Size as a variable, the AR Upsert step uses the actual value provided. Best Practice: When the source data has large number of duplicate records, BMC recommends either of the following: Use step that has the filtering capability, such as Filtering unique records, which filters the duplicate records before calling AR Upsert step or AROutput step. Keep the batch size smaller that can be compared with the unique records.
Fallback to single row commit if batch commit fails Checkbox	Use this option to switch to a single-row commit if batch-commit fails.
Insert Only	Creates a new record if an entry does not exist in AR System form. Skips the record, if an entry already exists in an AR System form. The Insert Only mode ignores the Use Checksum flag and adds a new entry.
Update Record	Performs an update as well as insert operation, based on the Checksum conditions.
Match By Request ID	Select to match output rows by Request ID. When you select the Match by Request ID checkbox, a new entry is created when the Request ID is not provided for the input row.
Multi Match Option	Select when more than one record can matche the given qualification.
Configure Matching Qualification	Select when more than one record can matche the given qualification.

Transform

Add Constants

Adds constant values to the specified field in the destination data source. For example, Calbro Services can add Disk drive as the Short Description value when transferring disc drive information from their SQL table to the BMC_Disc Drive CI class.

Add Constants step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Fields	Specify the name, type, and value in the form of a string. Then, specify the formats to convert the value into the chosen data type.

Select Values

This Select values step is useful for selecting, renaming, changing data types and configuring the length and precision of the fields on the stream. These operations are organized into different categories:

Select & Alter - Specify the exact order and name in which the fields have to be placed in the output rows
Remove - Specify the fields that have to be removed from the output rows
Meta-data - Change the name, type, length and precision (the meta-data) of one or more field

Select Values step fields and description (click to expand)

Field and description

Field	Description
Select & Alter tab
Step name	Name of the step; this name has to be unique in a single transformation.
Fields	Allows you to rename a field and specify the length and precision.
Get fields to select	Click to insert fields from all input steams to the step.
Edit Mapping	Click to open a mapping dialog to easily define multiple mappings between source and target fields. Note: Works if there is only one target output step.
Include unspecified fields, ordered by name	Select if you want to implicitly select all other fields from the input stream(s) that are not explicitly selected in the Fields section
Remove tab	This tab allows you to enter the fields that you want removed from the stream. You can also click Get fields to remove to add all fields from the input stream(s). This makes it easier if you are trying to remove several fields. After getting all fields, delete any of the fields that you do not want remove from the stream.
Field name
Get fields to remove	Click to add all fields from the input stream(s).
Meta-data tab	This tab allows you to rename, change data types, and change the length and precision of fields coming into the Select Values step. Click Get fields to import fields from previous steps. A lot of data type conversions are also possible with this tab.
Fields to alter the meta data for	Rename input fields, convert them to different data types, and alter the field length and precision.
Get fields to change	Not applicable

Sort Rows

The Sort rows step sorts rows based on the fields you specify and on whether they should be sorted in ascending or descending order.

Notes:

Kettle has to sort rows using temporary files when the number of rows exceeds the specified sort size (default 1 million rows). When you get an out of memory exception (OOME), you need to lower this size of change your available memory.
When you use multiple copies of the step in parallel (on the local JVM with "Change number of copies to start" or in a clustered environment using Carte) each of the sorted blocks need to be merged together to ensure the proper sort sequence. This can be done, be adding the Sorted Merge step afterwards (on the local JVM without multiple copies to start or in the cluster on the master).

Sort Rows step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Sort directory	The directory in which the temporary files are stored in case when needed; the default is the standard temporary directory for the system
TMP-file prefix	Choose an easily recognized prefix so you can identify the files when they show up in the temp directory.
Sort size (rows in memory)	Specify the number of rows to be allotted in database memory for sorting tasks.
Free memory threshold (in %)	If the sort algorithm finds that it has less available free memory than the indicated number, it will start to page data to disk. Note: This is not exact, because: This is checked every 1000 rows. Depending on the row size and other steps within complex transformations this could still lead to an OutOfMemoryError. In a Java Virtual Machine it's not possible to know the exact amount of free memory. As such BMC recommends you don't use this for very complex transformations with other steps and processes that use up a lot of memory.
Compress TMP Files	Select if you want to compress temporary files when they are needed to complete the sort.
Only pass unique rows?	Select if you want to pass unique rows only to the output stream(s).
Fields table	Specify the fields and direction (ascending/descending) to sort. You can specify whether to perform a case sensitive sort (optional)
Get Fields	Click to retrieve a list of all fields coming in on the stream(s).

Calculator

Provides you with predefined functions that can be executed on input field values. For example, Calbro Services can calculate the total cost of the computer systems in all their sites and add that information to BMC Helix CMDB.

For more information, see the Using the Calculator step video. This video is recorded using the earlier version of BMC Atrium Core and is valid for BMC Helix CMDB 9.1.

Scripting

Modified Java
Script Value

Provides an expression-based user interface for building JavaScript expressions. Allows you to create multiple scripts for each step. For example, Calbro Services can create an if/then/else statement to identify the operating system of a computer based on its serial number. The script can also include the concatenate function, which creates a work group by appending the virtual machine type to the operating system.

Modified Java Script Value step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Java script functions	This section provides a tree view of your available scripts, functions, input fields and output fields. Transformation Scripts: displays a list of scripts you have created in this step Transformation Constants: a list of pre-defined, static constants including SKIP_TRANSFORMATION, ERROR_TRANSFORMATION, and CONTINUE_TRANSFORMATION Transformation Functions: contains a variety of String, Numeric, Date, Logic and specialized functions you can use to create your script. To add a function to your script, simply double-click on the function or drag it to the location in your script that you wish to insert it. Input Fields: a list of inputs coming into the step. Double-click or use drag and drop to insert the field into your script. Output Fields: a list of outputs for the step.
Java Script	This section is where you edit the script for this step. You can insert functions, constants, input fields, etc. from the tree control on the left by double-clicking on the node you wish to insert or by dragging the object onto the Java Script panel.
Fields	The Fields table contains a list of variables from your script including the ability to add metadata like a descriptive name.
Get Variables	Click to retrieve a list of variables from your script.
Test script	Click to test the syntax of your script.

Flow

Filter rows

Allows you to filter rows based on conditions and comparisons. After this step is connected to a previous step, you can click the "<field>", "=" and "<value>" areas to construct a condition.

Filter rows step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Send 'true' data to step	The rows for which the condition specified is true are sent to this step.
Send 'false' data to step	The rows for which the condition specified are false are sent to this step
The Condition	Click NOT to negate the condition. Click <Field> to select from a list of fields from the input stream(s) to build your condition(s). Click <value> button to enter a specific value into your condition(s). To delete a condition, right-click and select Delete Condition.
Add Condition	Click to add a condition

Switch/Case

Implements the Switch/Case statement found in popular programming languages like Java. Pentaho routes rows of data to one or more target steps based on the value encountered in a certain field.

Switch/Case step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Field name to switch	Specify the field name that contains the value to use as a bases for the row routing.
Use string contains	Select this option to match any portion of the text available in the fields that are being tested. of the values in Clear this option to exactly match the field values.
Case value data type	Specify the data type of the values specified in this dialog
Case value conversion mask	Specify the conversion mask of the values specified in this dialog (numeric / date values)
Case value decimal symbol	Specify the decimal symbol of the values specified in this dialog (numeric values)
Case value grouping symbol	Specify the grouping symbol of the values specified in this dialog (numeric values)
Case values	Here you can specify a value-target step pair, one per row. To specify a null value, simply leave the value column blank while entering a target step. Please note that it is possible to specify the same target step more than once.
Default target step	Select all the rows that don't match any of the case values above are sent to this target step.

Identify last row in a
stream

Lookup

CMDBLookUp

This is a BMC provided step. Allows you to look up values in the BMC Helix CMDB based on the primary key that you specified. Lookup values are added as new fields onto the stream.

CMDB Lookup step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Connection	Specify the database connection to the lookup table.
ClassName	Display the required CMDB form name
Dataset ID	Select the required DatasetId from a list of IDs that you want to be looked up.
Enable cache?	Select if you want to enable caching of database lookups. This means that once a key (or group of key) has been looked up, the looked up values are stored, and returned again the next time this key (or group of key) is being looked up (without incurring the cost of a database call).
Cache size on rows	Specify the size of the cache (number of rows).
Load all data from table	Select if you want to pre-load the cache with all the data present in the lookup table. This can improve performance by avoiding database calls. However, if you have a large table, there is a risk of running out of memory. Important In memory look-ups can lead to different results because of the differences in the way your database compares data. For example, if your database table allows case-insensitive comparisons, you may get different results with this option. The same goes for data with trailing spaces. Those are ignored in certain databases when using certain character data types.
The key(s) to lookup the value(s)	The keys and conditions to perform the database lookup.
Values to return from the lookup table	The fields from the lookup table to add to the output stream. New name: allows you to use a different name if the database column name is inappropriate. Default: is the value returned (instead of null) if the lookup fails. Note that enabling error handling on the Database Lookup step will redirect all failed lookup rows to the error handling step, instead of adding them to the main output steam with null/default values. Type: is the type of the output field.
Do not pass rows if lookup fails	Select to avoid passing a row when lookup fails.
Fail on multiple results	Select to force the step to fail if the lookup returns multiple results.
Order by	If the lookup query returns multiple results, the ORDER BY clause helps you to select the record to take. For example, ORDER BY allows you to pick the customer with the highest sales volume in a specified state.
Get fields	Click to fetch all the available fields in this step.
Get lookup fields	Click to fetch all the lookup fields in this step.

Database Lookup

Looks up values in a database table. Lookup values are added as new fields onto the stream.

Database Lookup step fields and description (click to expand)

Field and description

Field	Description
Step name	Name of the step; this name has to be unique in a single transformation.
Connection	Specify the database connection to the lookup table.
Lookup Schema	Specify the database schema containing the lookup table.
Lookup table	Name of the database table used for the lookup.
Enable cache?	Select if you want to enable caching of database lookups. This means that once a key (or group of key) has been looked up, the looked up values are stored, and returned again the next time this key (or group of key) is being looked up (without incurring the cost of a database call).
Cache size on rows	Specify the size of the cache (number of rows).
Load all data from table	Pre-loads the cache with all the data present in the lookup table. This can improve performance by avoiding database calls. However, if you have a large table, there is a risk of running out of memory. Important In memory look-ups can lead to different results because of the differences in the way your database compares data. For example, if your database table allows case-insensitive comparisons, you may get different results with this option. The same goes for data with trailing spaces. Those are ignored in certain databases when using certain character data types.
The key(s) to lookup the value(s)	The keys and conditions to perform the database lookup.
Values to return from the lookup table	The fields from the lookup table to add to the output stream. New name: allows you to use a different name if the database column name is inappropriate. Default: is the value returned (instead of null) if the lookup fails. Note that enabling error handling on the Database Lookup step will redirect all failed lookup rows to the error handling step, instead of adding them to the main output steam with null/default values. Type: is the type of the output field.
Do not pass rows if lookup fails	Select to avoid passing a row when lookup fails.
Fail on multiple results	Select to force the step to fail if the lookup returns multiple results.
Order by	If the lookup query returns multiple results, the ORDER BY clause helps you to select the record to take. For example, ORDER BY allows you to pick the customer with the highest sales volume in a specified state.
Get fields	Click to return a list of available fields from the input stream(s) of the step.
Get lookup fields	Click to return a list of available fields from the lookup table that can be added to the step's output stream.

Jobs

Get variables

Allows you to get the value of a variable. This step can return rows or add values to input rows.

Set variables

Allows you to set variables in a job or in the virtual machine. It accepts one (and only one) row of data to set the value of a variable.

Steps in Atrium Integrator transformation

To access the Atrium Integrator Spoon

To add a transformation or a job

On this page