This documentation supports the 22.1 version of BMC Helix CMDB, which is available only to BMC Helix customers (SaaS).

To view an earlier version, select the version from the Product version menu.

Steps in Atrium Integrator transformation

You can add intermediate steps to a transformation in a job that you create through the Integration Job Builder wizard, or to a custom job you create through Atrium Integrator Spoon.

The following table lists the steps that you can add to your transformation to manipulate the data before adding it to BMC Helix CMDB

BMC supports only the steps listed in the table. For information about additional steps that you can add to your transformation, see the Pentaho online documentation.

Any queries around complex customization using unsupported steps will be addressed selectively and at discretion of BMC. See the topic Atrium Integrator Spoon support in Atrium Integrator Spoon.


The following video shows an overview of how to create a transformation and job in Atrium Integrator Spoon, and manage the Spoon job from the Atrium Integrator console in CMDB Portal:

https://youtu.be/xKMKI77lfVk

To access the Atrium Integrator Spoon

  1. Log in to the computer on which Atrium Integrator Spoon is installed.
  2. To open Atrium Integrator Spoon, open the Spoon.bat file at the following location:
    <installationDirectory>\diserver\data-integration\

To add a transformation or a job 

  1. Select Job or Transformation.
     

  2. Click the step that you want to add and drag it to the canvas.

The following table lists the intermediate steps for transformations and the descriptions:

StepWhat it helps you to do
Input

ARInput

ARInput is a BMC provided step. It allows you to migrate AR System data into BMC Helix CMDB.

Field and description

FieldDescription
General tab
ConnectionSpecify the AR System server name.
Form NameSpecify the form name from where you want fetch the AR System data.
Chunk SizeSpecify the number of records you want to pull in one thread.
Qualification tab
Qualification StringSpecify the qualification value.
Fields tab
Field listDisplays the AR System field details. Click Get Fields to populate all the available AR System fields.

CMDBInput

CMDBInput is a BMC provided step. It allows you to migrate data from the BMC Helix CMDB on your development server to the BMC Helix CMDB on your production server.

To understand how the CMDBInput step is used see, Migrating data from one BMC Helix CMDB server to another.

Field and description

FieldDescription
ConnectionSpecify the AR System server name.
Class NameSpecify the CMDB form name from where you want to read the data.
NameSpaceSpecify the BMC namespace. For example BMC.Core
Dataset IdSpecify the BMC dataset ID. For example, BMC.ASSET
Attribute ListDisplays the list of attributes that belong to the selected class. Click Get Attributes to populate the list.
To Filter Criteria Select Values of Attributes

Define the filter criteria to compare CMDB fields to qualifications value.

Selected QualificationsDisplays the list of qualification strings. For example, in AND 'MarkAsDeleted' = NULL string, AND is the join condition, MarkAsDeleted is the field value and NULL is the qualification value .

ARX File Input

ARX File Input is a BMC provided step. It allows you to migrate data from ARX files into BMC Helix CMDB.

Field and description

FieldDescription
File NameSpecify fieldname that will contain the filename(s) to read from. 
Chunk SizeSpecify number of records you want to pull in one thread.
Schema NamesAR form name that the arx file uses.
ConnectionSpecify the AR server name.
Fields tableList of arx data fields. The data is pushed in the displayed fields only.
Preview Rows buttonClick to view the data in the fields.

Table Input

Table input step allows you to read information from a database, using a connection and SQL. Basic SQL statements are generated automatically.

BMC supports selective vendor databases which include – IBM DB2, MS SQL Server, Oracle, Sybase, and My SQL. 

To understand how Table input step is used, see Troubleshooting issues with Atrium Integrator migrated jobs.

For information about adding qualifications to the Table Input step, see the following video:

https://youtu.be/DMpTugESxjM.

This video is recorded using the earlier version of BMC Atrium Core and is valid for BMC Helix CMDB 9.1 and above.

Field and description

FieldDescription
ConnectionThe database connection from which to read data.
SQL

The SQL statement used to read information from the database connection. You can also click Get SQL select statement... to browse tables and automatically generate a basic select statement.

Enable lazy conversionSelect if you want to avoid unnecessary data type conversions. Enabling this option results in a significant performance improvements.
Replace variables in script?Select if you want to replace variables in the script. This option allows you to test with or without performing variable substitutions.
Insert data from stepInput step name where Pentaho can expect information to come from. This information can then be inserted into the SQL statement. The locators where Pentaho inserts information is indicated by ? (question marks).
Execute for each row?Select to perform the data insert for each individual row.
Limit sizeSets the number of lines that is read from the database. Zero (0) means read all lines.

CSV File Input

CSV File input step allows you to migrate data from CSV files into BMC Helix CMDB.

Field and description

FieldDescription
FilenameSelect the fieldname that will contain the filename(s) to read from. 
DelimiterSpecify the file delimiter character used in the target file. Special characters (e.g. CHAR HEX01) can be set with the format $[value], e.g. $[01] or $[6F,FF,00,1F].
EnclosureSpecify the enclosure character used in the target file. Special characters (e.g. CHAR HEX01) can be set with the format $[value], e.g. $[01] or $[6F,FF,00,1F].
NIO buffer sizeThis is the size of the read buffer.  It represents the amount of bytes that is read in one time from disk. 
Lazy conversionThe lazy conversion algorithm tries to avoid unnecessary data type conversions and can result in a significant performance improvements if this is possible.  For example, reading from a text file and writing back to a text file.
Header row present?Select this option if the target file contains a header row containing column names.
Add filename to resultAdds the CSV filename(s) to the result of this transformation.  A unique list is being kept in memory that can be used in the next job entry in a job, for example in another transformation.
The row number field name (optional)The name of the Integer field that will contain the row number in the output of this step. 
Running in parallel?

Select if you have multiple instances of this step running (step copies) and if you want each instance to read a separate part of the CSV file(s).
When reading multiple files, the total size of all files is taken into consideration to split the workload. And make sure that ALL step copies receive all files that need to be read, otherwise, the parallel algorithm will not work correctly.
WARNING:

For technical reasons, parallel reading of CSV files is only supported on files that don't have fields with line breaks or carriage returns in them.

File encodingSpecify the encoding of the file being read.
Fields TableDisplays an ordered list of fields to be read from the target file.
Preview buttonClick to preview the data coming from the target file.
Get Fields buttonClick to return a list of fields from the target file based on the current settings (i.e. Delimiter, Enclosure, etc.). All fields identified will be added to the Fields Table.

Text File Input

The Text File Input step allows you to migrate data from a variety of different text-file types.

The Text File Input step provides you with the ability to specify a list of files to read, or a list of directories with wild cards in the form of regular expressions. In addition, you can accept filenames from a previous step making filename handling more even more generic.

Field and description

FieldDescription
File tab
File or directory

This field specifies the location and/or name of the input text file.

Click Add to add the file/directory/wildcard combination to the list of selected files (grid).

Regular ExpressionSpecify the regular expression you want to use to select the files in the directory specified in the previous option. For example, you want to process all files that have a .txt extension.
Exclude Regular ExpressionSpecify the regular expressions you want to exclude.
Selected files tableDisplays a list of selected files (or wildcard selections) along with a property specifying if file is required or not. If a file is required and it isn't found, an error is generated. Otherwise, the filename is skipped.
Accept filenames from previous step

This option allows even more flexibility in combination with other steps such as "Get Filenames". You can construct your filename and pass it to this step. This way the filename can come from any source: text file, database table, etc.

Select this option to get filenames from previous steps.

Pass through fields from previous stepSelect if you want to pass through fields from previous step.
Step to read filenames fromSpecify the step from which to read the filenames.
Field in input to use as filenameText File Input looks in this step to determine which filenames to use.
Show filenames(s)... 

Displays a list of all files that will be loaded based on the current selected file definitions.

Show file contentDisplays the content of the selected file.
Show content from first data lineDisplays the content from the first data line only for the selected file.
Content tabThe content tab enables you to specify the format of the text files that are being read.
File typeCan be either CSV or Fixed length. Based on this selection, Atrium Integrator Spoon launches a different helper GUI when you click the Get fields button in the Fields tab.
Separator

Specify one or more characters that separate the fields in a single line of text. Typically this is ; or a tab. Special characters (e.g. CHAR ASCII HEX01) can be set with the format $[value], for example, $[01] or $[6F,FF,00,1F].

EnclosureSome fields can be enclosed by a pair of strings to allow separator characters in fields. The enclosure string is optional. If you use repeat an enclosures allow text line 'Not the nine o''clock news.'. With ' the enclosure string, this gets parsed as Not the nine o'clock news. Special characters (e.g. CHAR ASCII HEX01) can be set with the format $[value], for example, $[01] or $[6F,FF,00,1F].
EscapeSpecify an escape character (or characters) if you have these types of characters in your data. If you have \ as an escape character, the text 'Not the nine o\'clock news' (with ' the enclosure) gets parsed as Not the nine o'clock news. Special characters (e.g. CHAR HEX01) can be set with the format $[value], for example, $[01] or $[6F,FF,00,1F].
HeaderSelect if your text file has a header row (first lines in the file). You can specify the number of times the header lines appears.
FooterSelect if your text file has a footer row (last lines in the file). You can specify the number of times the footer row appears.
Wrapped lines

Use if you deal with data lines that have wrapped beyond a specific page limit.

Note:

Headers and footers are never considered wrapped.

Page layoutUse these options as a last resort when dealing with texts meant for printing on a line printer; use the number of document header lines to skip introductory texts and the number of lines per page to position the data lines.
Compression

Select if your text file is placed in a Zip or GZip archive.

Note:

Currently, only the first file in the archive is read.

No empty rowsSelect if you do not send empty rows to the next steps.
Include filename in outputSelect if you want the filename to be part of the output.
Rownum in outputSelect if you want the row number to be part of the output
FormatCan be either DOS, UNIX or mixed. UNIX files have lines that are terminated by line feeds. DOS files have lines separated by carriage returns and line feeds. If you specify mixed, no verification is done.
EncodingSpecify the text file encoding to use; leave blank to use the default encoding on your system. To use Unicode, specify UTF-8 or UTF-16. On first use, Spoon searches your system for available encodings.
Limit

Sets the number of lines that is read from the file; zero (0) means read all lines.

Be lenient when parsing datesDeselect, if you want strict parsing of data fields. If case-lenient parsing is selected, dates like Jan 32nd will become Feb 1st.
The date format localeThis locale is used to parse dates that have been written in full such as "February 2nd, 2006;" parsing this date on a system running in the French (fr_FR) locale would not work because February is called Février in that locale.
Add filenames to resultSelect to add output file names as a field in the results.
Error Handling tabThe error handling tab allows you to specify how the step reacts when errors occur.
Ignore errorsSelect if you want to ignore errors during parsing.
Skip error linesSelect if you want to skip those lines that contain errors. You can generate an extra file that contains the line numbers on which the errors occurred. Lines with errors are not skipped, the fields that have parsing errors, will be empty (null).
Error count field nameAdd a field to the output stream rows. This field contains the number of errors on the line.
Error fields field name

Add a field to the output stream rows. This field contains the field names on which an error occurred.

Error text field name

Add a field to the output stream rows. T

This field contains the descriptions of the parsing errors that have occurred.

Warning files directoryWhen warnings are generated, they are placed in this directory. The name of that file is <warning dir>/filename.<date_time>.<warning extension>.
Error files directoryWhen errors occur, they are placed in this directory. The name of the file is <errorfile_dir>/filename.<date_time>.<errorfile_extension>.
Failing line numbers file directoryWhen a parsing error occurs on a line, the line number is placed in this directory. The name of that file is <errorline dir>/filename.<date_time>.<errorline extension>.
Filters tabThe filters tab provides you with the ability to specify the lines you want to skip in the text file.
Filter stringSpecify the string for which to search.
Filter positionSpecify the position where the filter string has to be at in the line. Zero (0) is the first position in the line. If you specify a value below zero (0), the filter string is searched for in the entire string.
Stop on filterSpecify Y if you want to stop processing the current text file when the filter string is encountered.
Positive matchIncludes the rows where the filter condition is found (include). The alternative is that those rows are avoided (exclude).
Fields tabThe fields tab allows you to specify the information about the name and format of the fields being read from the text file.
NameName of the field.
TypeType of the field can be either String, Date or Number.
Format

Select the format in which you want the Type of the field  

For example, if you select Date in the Type column, you can select a format from the Format list. 

Length
  • For Number: Total number of significant figures in a number.
  • For String: total length of string.
  • For Date: length of printed output of the string (e.g. 4 only gives back the year).
Precision
  • For Number: Number of floating point digits.
  • For String, Date, Boolean: unused.
CurrencyUsed to interpret numbers like $10,000.00 or E5.000,00
DecimalA decimal point can be a "." (10;000.00) or "," (5.000,00)
GroupA grouping can be a dot "," (10;000.00) or "." (5.000,00)
Null IfTreat this value as NULL.
DefaultDefault value in case the field in the text file was not specified (empty).
Trim typeType trim this field (left, right, both) before processing.
RepeatIf the corresponding value in this row is empty, repeat the one from the last time it was not empty.

Get Data from XML

Get Data From XML allows you to read data from any type of XML file using XPath specifications. Get Data From XML step can read data from 3 kind of sources (files, stream and url) in 2 modes (user can define files and urls at static mode or in a dynamic way). 

Field and description

FieldDescription
File tabThe Files tab is where you define the location of the XML files from which you want to read.
XML source is defined in a fieldSelect if the previous step is giving XML data in a certain field in the input stream.
XML source is file nameSelect if the previous step is giving filenames in a certain field in the input stream.  These are read.
Read source as UrlSelect if the  previous step is giving URLs in a certain field in the input stream.  These are read.
Get XML source from a fieldSpecify the field to read XML, filename or URL from.
File or directorySpecify the location and/or name of the input text file. Note: Click Add to add the file/directory/wildcard combination to the list of selected files (grid) below.
Regular ExpressionSpecify the regular expression you want to use to select the files in the directory specified in the previous option.
Exclude Regular ExpressionSpecify the regular expressions you want to exclude.
Selected FilesFiles from where the data is pulled.
Show filename(s)Contains a list of selected files (or wildcard selections) and a property specifying if file is required or not. If a file is required and it is not found, an error is generated;otherwise, the file name is skipped.
Content tab
Loop XPathFor every "Loop XPath" location we find in the XML file(s), we will output one row of data.  This is the main specification we use to flatten the XML file(s).  Click Get XPath nodes to search for the possible repeating nodes in the XML document.  Please note that if the XML document is large that this can take a while.
EncodingThe XML filename encoding in case none is specified in the XML documents.
Namespace awareSelect to make the XML document namespace aware.
Ignore commentsIgnore all comments in the XML document while parsing.
Validate XMLValidate the XML prior to parsing. Use a token when you want to replace dynamically in a Xpath field value. A token is between @_ and - (@_fieldname-). Please see the Example 1 to see how it works.
Use tokenA token is not related tro XML parsing but to PDI.
Ignore empty fileAn empty file is not a valid XML document.  Select this if you want to ignore those altogether.
Do not raise an error if no filesSelect if you don't want to raise an error if no files are found.
LimitLimits the number of rows to this number (zero (0) means all rows).
Prune path to handle large filesAlmost the same value as the "Loop XPath" property with some exceptions.
Include filename in output?Select if you want to specify a field name to include the file name (String) in the output of this step.
Rownum in output?Select if you want to specify a field name to include the row number (Integer) in the output of this step.
Add files to result filesnameAdds the XML filenames read to the result of this transformation.  A unique list is being kept in memory that can be used in the next job entry in a job, for example in another transformation.
Fields tab
NameThe name of the output field.
XPathThe path to the element node or attribute to read 
ElementThe element type to read: Node or Attribute 
TypeThe data type to convert to 
FormatThe format or conversion mask to use in the data type conversion 
LengthThe length of the output data type 
PrecisionThe precision of the output data type
CurrencyThe currency symbol to use during data type conversion 
DecimalThe numeric decimal symbol to use during data type conversion
GroupThe numeric grouping symbol to use during data type conversion
Trim typeThe type of trimming to use during data type conversion
RepeatRepeat the column value of the previous row if the column value is empty (null)

Excel Input

Excel Input step enables you to migrate data from Excel files into BMC Helix CMDB.

Note

When you provide input as Time datatype in a Microsoft Excel Input step, the time is not updated in Spoon because Pentaho does not support the Time datatype.

Workaround: Specify time inputs as strings and change the format to HH:mm:ss, and then run the transformation.

An example is shown in the following figure:



Field and description

FieldDescription
Files tab
File or directorySpecify the file name from where you want to pull the data.
Regular ExpressionSpecify the regular expression you want to use to select the files in the directory specified in the previous option.
Exclude Regular Expression

Specify the regular expressions you want to exclude.


Selected FilesA list of files that will be used in this step, according to the criteria specified in the previous fields.
Accept filenames from previous stepSelect to retrieve a list of filenames from the previous step in this transformation. You must also specify which step you are importing from, and the input field in that step from which you will retrieve the filename data. If you choose this option, the Show filename(s) option will show a preview of the list of filenames.
Sheets tab
List of sheets to readA list of worksheets that you want to use. If this remains empty, all worksheets in all specified files will be selected. Rows and columns are numbered, starting with 0.
Get sheetname(s)Click to retrieve a list of worksheets from all of the specified files and give you the option to select some or all of them for this step.
Content tab
HeaderSelect if there is a header row to skip in the selected worksheets.
No empty rowSelect if you want to remove empty rows from the output.
Stop on empty rowSelect if you want to stop reading from the current worksheet when an empty row is read.
LimitSets a static number of rows to read. If set to 0, there is no set limit.
EncodingSpecifies the text file encoding to use. Leave blank to use the default encoding on your system. To use Unicode, specify UTF-8 or UTF-16. On first use, Spoon searches your system for available encodings.
Spread sheet type (engine)Specifies which spreadsheet format to expect from the file, regardless of its extension.
Add filenames to resultSelect to pass the input filenames to the output.
Error Handling tab
Strict types?Select if you want PDI to report data type errors in the input.
Ignore errors?Select if you don't want no errors to be reported during input parsing.
Skip error lines?Select if you want PDI to skip lines that contain errors. These lines can be dumped to a separate file by specifying a path in the Failing line numbers files directory field below. If this is not selected, lines with errors appear as NULL values in the output.
Warning files directoryDirectory in which to create files that contain warning messages regarding input values for each spreadsheet file read. These files will have the extension you specify here.
Error files directoryDirectory in which to create files that contain error messages regarding input values for each spreadsheet file read. These files will have the extension you specify here.
Failing line numbers files directoryDirectory in which to create files that contain the lines that failed error checks during input validation. These files will have the extension you specify here.
Fields tab
NameThe name of the field.
TypeThe field's data type; String, Date or Number.
Length

The length option depends on the field type. 

  • Number: total number of significant figures in a number.
  • String: total length of a string.
  • Date: determines how much of the date string is printed or recorded.
PrecisionThe precision option depends on the field type, but only Number is supported; it returns the number of floating point digits.
Trim typeTruncates the field (left, right, both) before processing. Useful for fields that have no static length.
RepeatIf set to Y, will repeat this value if the next field is empty.
FormatThe format mask (number type).
CurrencySymbol used to represent currencies.
DecimalA decimal point; this is either a dot or a comma.
GroupingA method of separating units of thousands in numbers of four digits or larger. This is either a dot or a comma.
Additional output fields tab
Full filename field

The full file name including the file path and the file extension. 

For example, C:\Users\Administrator\Desktop\INC.xls 

Sheetname fieldThe worksheet name you're using.
Sheet row nr fieldThe current sheet row number.
Row nr written fieldNumber of rows written.
Short filename field

Only the file name. 

For example, INC.xls 

Extension fieldThe three- or four-letter file type extension.
Path field

Only the file path.

For example, C:\Users\Administrator\Desktop

Output

AROutput

AROutput is a BMC provided step. It allows you to export data onto AR system.

Field and description

FieldDescription
General tab
Step nameName of the step; this name has to be unique in a single transformation.
ConnectionSpecify the the database connection to which data is written.
Form NameSpecify the AR System form name where you want to push the data.
Batch Commit SizeSpecify the amount of data you want to commit in one batch.
Fallback to single row commit if batch commit failsSelect if you want to proceed with single row commit if batch commit fails.
Duplicates tab
Duplicate Request ActionSelect the data duplication conditions.
Match By Request IDSelect if you want to match the data and update by request ID.
Data Handling tab
Require Required FieldsIs selected by default if you select Match by Request ID.
Enforce Pattern MatchingIs selected by default if you select Match by Request ID.
Field Mapping tab
Merge FieldsUse to map the form field to stream field.

CMDBOutput

CMDBOutput is a BMC provided step. It allows you to load data into BMC Helix CMDB.

Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
ConnectionThe database connection to which data is written.
Class NameSpecify the CMDB form name from where you want to push the data.
NameSpaceSpecify the BMC namespace. For example BMC.Core
Dataset IDSpecify the BMC dataset ID. For example, BMC.ASSET
CMI Commit SizeNot applicable.
Qualification String

Not applicable.

Use The key(s) to look up the value(s) section instead.  

Only insert new CIs

Select if you want to update only new CIs.
Always insert CIsSelect if you want to transfer all source records to BMC Helix CMDB without checking for duplicates. See, Inserting or updating bulk records in BMC Helix CMDB.
Use CacheSelect if you want to insert or update bulk records in BMC Helix CMDB. See, Increasing cache size for Atrium Integrator.
Use CheckSumSelect if you want to use the CheckSum feature to update records in BMC Helix CMDB. See, Configuring the checksum value for loading data into BMC Helix CMDB.
The key(s) to lookup the value(s)Define the qualification to find the matching data.
Update fieldsDefine if the input data should update the existing data.
Edit MappingDefine the mapping between source and target fields.

Table Output

The Table output step allows you to load data into a database table.

BMC supports selective vendor databases which include – IBM DB2, MS SQL Server, Oracle, Sybase, and My SQL. 

Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
ConnectionSpecify the database connection to which data is written.
Target SchemaSpecify the name of the Schema for the table to write data to. This is important for data sources that allow for table names with periods in them.
Target tableSpecify the name of the table to which data is written.
Commit size

Use transactions to insert rows in the database table. Commit the connection every N rows if N is larger than zero (0); otherwise, don't use transactions. (Slower).

Note: Transactions are not supported on all database platforms.

Truncate table

Select if you want the table to be truncated before the first row is inserted into the table.

Note: Don't use this option when you are running the transformation clustered.

Main options tab
Partition data over tablesUse to split the data over multiple tables.
Use batch updates for insertsSelect if you want to use batch inserts. This feature groups inserts statements to limit round trips to the database. This is the fastest option and is enabled by default. This feature is not available under these conditions: If the transformation database is transactional, if you are using Greenplum or PostgreSQL with error handling turned on, if you are using auto-generated keys.
Is the name of the table defined in the field?Use these options to split the data over one or more tables; the name of the target table is defined in the field you specify. For example if you store customer data in the field gender, the data might end up in tables M and F (Male and Female). There is an option to exclude the field containing the tablename from being inserted into the tables.
Return auto-generated keySelect if you want to get back the key that was generated by inserting a row into the table.
Name of auto-generated key fieldSpecifies the name of the new field in the output rows that contains the auto-generated key.
SQLGenerates the SQL to create the output table automatically.
Database fields tab
Fields to insertEnter the Source (Stream) and Table (Target or Existing data) Field name.
Enter field mappingDefine the mapping between source and target fields.

Text File Output

The Text file output step is used to export data to text file format. This is commonly used to generate Comma Separated Values (CSV files) that can be read by spreadsheet applications.  It is also possible to generate fixed width files by setting lengths on the fields in the fields tab.

Field and description

FieldDescription
File tabThe File tab is where you define basic properties about the file being created.
Step nameName of the step; this name has to be unique in a single transformation.
FilenameSpecify the filename and location of the output text file.
Run this as a command instead?Select if you want to "pipe" the results into the command or script you specify.
Do not create file at start?Select if you don't want to create the file until the end of the step.
ExtensionAdds a point and the extension to the end of the filename. (.txt)
Accept file name from field? Select if you want to specify the filename(s) in a field in the input stream. 
File name field When the previous option is enabled, you can specify the field that will contain the filename(s) at runtime. 
Include stepnr in filenameIf you run the step in multiple copies (Launching several copies of a step), the copy number is included in the filename, before the extension.
Include partition nr in filename?Select if you want to includes the data partition number in the filename.
Include date in filenameSelect if you want to includes the system date in the filename.
Include time in filenameSelect if you want to includes the system time in the filename.
Specify date formatSelect if you want the filename to include a date and time stamp that follows the selection you choose from the drop-down box. Selecting this option disables the previous two options.
Show filename(s)This option shows a list of the files that will be generated. 
Note: This is a simulation and among others depends on the number of rows that will go into each file.
Add filenames to resultSelect if you want to use the Filename field in constructing the result filename. If deselected, the Filename field is ignored.
Content tab
AppendSelect if you want to append lines to the end of the specified file.
SeparatorSpecify the character that separates the fields in a single line of text. Typically this is ; or a tab.
EnclosureA pair of strings can enclose some fields. This allows separator characters in fields. The enclosure string is optional.
Force the enclosure around fields?Select if you want to force all field names to be enclosed with the character specified in the Enclosure property above.
Disable the enclosure fix?This is for backward compatibility reasons (since version 4.1) related to enclosures and separators. The logic since version 4.1 is: When a string field contains an enclosure it gets enclosed and the enclose itself gets escaped. When a string field contains a separator, it gets enclosed. Check this option, if this logic is not wanted. It has also an extra performance burden since the strings are scanned for enclosures and separators. So when you are sure there is no such logic needed since your strings don't have these characters in there and you want to improve performance, un-check this option. 
HeaderSelect if you want the text file to have a header row. (First line in the file).
FooterSelect if you want the text file to have a footer row. (Last line in the file).
FormatThis can be either DOS or UNIX. UNIX files have lines are separated by linefeeds. DOS files have lines separated by carriage returns and line feeds. 
EncodingSpecify the text file encoding to use. Leave blank to use the default encoding on your system. To use Unicode specify UTF-8 or UTF-16. On first use, Spoon will search your system for available encodings.
CompressionAllows you to specify the type of compression, .zip or .gzip to use when compressing the output. 
Note: Currently, only one file is placed in a single archive.
Right pad fieldsSelect if you want to add spaces to the end of the fields (or remove characters at the end) until they have the specified length.
Fast data dump (no formattingSelect if you want to improve the performance when dumping large amounts of data to a text file by not including any formatting information.
Split every ... rowsIf this number N is larger than zero, split the resulting text-file into multiple parts of N rows.
Add Ending line of fileSelect if you want to specify an alternate ending row to the output file.
Fields tabThe fields tab is where you define properties for the fields being exported.
NameThe name of the field.
TypeType of the field can be either String, Date or Number.
FormatThe format mask to convert with. See Number Formats for a complete description of format symbols.
LengthThe length option depends on the field type follows:
  • Number - Total number of significant figures in a number
  • String - total length of string
  • Date - length of printed output of the string (e.g. 4 only gives back year)
PrecisionThe precision option depends on the field type as follows:
  • Number - Number of floating point digits
  • String - unused
  • Date - unuse
CurrencySymbol used to represent currencies like $10,000.00 or E5.000,00
DecimalA decimal point can be a "." (10,000.00) or "," (5.000,00)
GroupA grouping can be a "," (10,000.00) or "." (5.000,00)
Trim typeThe trimming method to apply on the string. Trimming only works when there is no field length given.
NullIf the value of the field is null, insert this string into the textfile.
GetClick to retrieve the list of fields from the input fields stream(s)
Minimal widthAlter the options in the fields tab in such a way that the resulting width of lines in the text file is minimal. So instead of save 0000001, we write 1, etc. String fields will no longer be padded to their specified length.

Insert\Update

The Insert/Update step first looks up a row in a table using one or more lookup keys. If the row can't be found, it inserts the row. If it can be found and the fields to update are the same, nothing is done. If they are not all the same, the row in the table is updated.

Note: If you have multiple rows with the same keys that match, only the first row found is compared. This may lead to different results, depending on if the found row matches with given values or not. The update scenario looks like this: If a difference is found in the case of multiple rows with the same key, an UPDATE statement is fired against the database that updates all rows with the matching keys. This note also applies to the Update step.

 Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
ConnectionSpecify the database connection to which data is written.
Target schemaSpecify the name of the Schema for the table to which data is written. This is important for data sources that allow for table names with periods in them.
Target tableName of the table in which you want to do the insert or update.
Commit sizeThe number of rows to change (insert / update) before running a commit.
Don't perform any updatesSelect if you want the values in the database to be never updated and only inserts are performed.
Key Lookup table

Allows you to specify a list of field values and comparators. You can use the following comparators: =, <>, <, <=, >, >=, LIKE, BETWEEN, IS NULL, IS NOT NULL

 Note: Click Get fields to retrieve a list of fields from the input stream(s).

Update Fields

Allows you to specify all fields in the table you want to insert/update including the keys. Avoid updates on certain fields by specifying N in the update column.

 Note: Click Get Update fields to retrieve a list of update fields from the input stream(s).

SQL buttonClick SQL to generate the SQL to create the table and indexes for correct operation.

XML Output

This XML Output step allows you to write rows from any source to one or more XML files.

 Field and description

FieldDescription
File tabThe file tab is where you set general properties for the XML output file format
Step nameName of the step; this name has to be unique in a single transformation.
FilenameSpecify the file name and location of the output text file
Do not create file at startSelect if you don't want to create the file until the end of the step.
ExtensionAdds a period and the extension to the end of the file name (.xml)
Include stepnr in filenameIf you run the step in multiple copies (see also Launching Several Copies of a step), the copy number is included in the file name, before the extension 
Include date in filenameSelect if you want to include the system date in the file name
Include time in filenameSelect if you want to include the system date in the file name
Specify Date time formatSelect if you want the filename to include a date and time stamp that follows the selection you choose from the drop-down box. Selecting this option disables the previous two options.
Add filenames to resultSelect if you want to use the Filename field in constructing the result filename. If deselected, the Filename field is ignored.
Content tab
ZippedSelect if you want the XML file to be stored in a ZIP archive.
EncodingThe encoding to use. This encoding is specified in the header of the XML file.
Parent XML elementThe name of the root element in the XML document.
Row XML elementThe name of the row element to use in the XML document.
Split every ... rows.

The maximum number of rows of data to put in a single XML file before another is created.

Omit null values from XML Output

Select this option to exclude elements containing null values from the output XML file.

Clear this option to include elements containing null values in the output XML file.

You can specify another value to replace null with the Null field in the Fields tab.

Fields tab
FieldnameThe name of the field.
ElementnameThe name of the element in the XML file to use. Type: Type of the field can be either String, Date, or Number.
TypeType of the field can be either String, Date, or Number. Format mask from which to convert.
LengthThe length option depends on the field type follows:
  • Number - Total number of significant figures in a number
  • String - total length of string
  • Date - length of printed output of the string (e.g. 4 only gives back year)

Note: the output string is padded to this length if it is specified.

PrecisionThe precision option depends on the field type as follows:
  • Number - Number of floating point digits
  • String - unused
  • Date - unused
CurrencySymbol used to represent currencies such as $10,000.00 or E5.000,00.
DecimalA decimal point can be a "." (10,000.00) or "," (5.000,00).
GroupA grouping can be a "," (10,000.00) or "." (5.000,00).
NullIf the value of the field is null, insert this string into the textfile.
Get fieldsClick to retrieve the list of fields from the input stream(s).
Minimal widthAlter the options in the fields tab in such a way that the resulting width of lines in the text file is minimal; for example instead of save 0000001, "1" is written, and so on. String fields are not padded to their specified length.

Excel Output

The Excel Output step allows you to write data to one or more Excel files.

 Field and description

FieldDescription
File tab
FilenameThe name of the spreadsheet file you are reading from.
Do not create file at startSelect if you don't want to create the file until the end of the step.
ExtensionThe three-letter file extension to append to the file name.
Include stepnr in filenameIf you run the step in multiple copies (launching several copies of a step), the copy number is included in the file name, before the extension.
Include date in filenameSelect if you want to include the system date in the filename
Include time in filenameSelect if you want to include the system time (24-hour format) in the filename
Specify Date time formatSelect if you want the filename will to include a date and time stamp that follows the selection you choose from the drop-down box. Selecting this option disables the previous two options.
Show file name(s)Displays a list of the files that will be generated. This is a simulation and depends on the number of rows that will go into each file.
Add filenames to resultSelect if you want to use the Filename field in constructing the result filename. If deselected, the Filename field is ignored.
Content tab
AppendSelect if you want to append lines to the end of the specified file. If the file does not exist, a new one will be created.
HeaderSelect if you want a header to appear before the spreadsheet grid data.
FooterSelect if you want a footer to appear after the spreadsheet grid data.
EncodingSpecifies the text file encoding to use. Leave blank to use the default encoding on your system. To use Unicode, specify UTF-8 or UTF-16. On first use, Spoon searches your system for available encodings and populates this list accordingly.
Split every ... rowsSpecify after how many rows, start a new spreadsheet file to continue data output.
Sheet nameSpecifies the name of the worksheet within the spreadsheet file.
Product sheet?Select if you want to enable password protection on the worksheet. You must also specify a password in the Password field.
Auto size columnsSelect if you want to, automatically size the worksheet columns to the largest value.
Retain NULL valuesSelect if you want to preserve NULL values in the output. If deselected, NULLs are replaced with empty strings.
Use TemplateSelect if you want PDI to use the specified Excel template to create the output file. The template must be specified in the Excel template field.
Append to ExcelSelect if you want append output to the specified Excel template.
Fields tab
NameThe name of the field.
TypeThe field's data type; String, Date or Number.
FormatThe format mask (number type).

Update

Update is similar to the Insert/Update step except that updates only are performed. No inserts are performed.

 Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
ConnectionSpecify the database connection to which data is written
Target schemaSpecify the name of the Schema for the table to which data is written. This is important for data sources that allow for table names with periods in them.
Target tableName of the table in which you want to do the insert or update.
Commit sizeSpecify the number of rows to change (insert / update) before running a commit.
Skip lookup

Select this option to update the target database without comparing values between target and source datasets.

Ignore lookup failureSelect this option to ignore failures in the step and process the data from the record stream (source dataset).
Key Lookup table

Allows you to specify a list of field values and comparators. You can use the following comparators: =, <>, <, <=, >, >=, LIKE, BETWEEN, IS NULL, IS NOT NULL

 Note: Click Get fields to retrieve a list of fields from the input stream(s).

Update Fields

Allows you to specify all fields in the table you want to update including the keys. Avoid updates on certain fields by specifying N in the update column.

 Note: Click Get Update fields to retrieve a list of update fields from the input stream(s).

SQL buttonClick SQL to generate the SQL to create the table and indexes for correct operation.


AR Upsert


AR Upsert is a BMC provided step.The Checksum mechanism in the AR Upsert step compares the Checksum value of the source record and the target record to determine whether or not the source record needs to be imported. 

Field and description

FieldDescription
General tab
ConnectionClick the New button to create a new AR System server connection. Click the Edit button to edit an existing connection. Select an existing connection from the drop down menu.Click Wizard for a step-by-step guided procedure to create a new connection.
Form NameName of the AR System form where you output the data. This also works as a staging form when you provide the Alternate Target Form Name.
Use Checksum CheckboxPerforms checksum comparison in update mode.
Alternate Target Form NameSpecify when the target form is different from the staging form. When you provide the Alternate Target Form Name in an AR Upsert step, the operations are performed on the staging formWhen you update any record, an entry is inserted into the staging form.
Batch Commit Size

(Optional) When you specify a Batch Commit Size, the bulk API import the data. You can specify the Batch Commit Size as variable. When you specify the Batch Commit Size as a variable, the AR Upsert step uses the actual value provided.

Best Practice:

When the source data has large number of duplicate records, BMC recommends either of the following:

  • Use step that has the filtering capability, such as Filtering unique records, which filters the duplicate records before calling AR Upsert step or AROutput step.

 

  • Keep the batch size smaller that can be compared with the unique records.
Fallback to single row commit if batch commit fails CheckboxUse this option to switch to a single-row commit if batch-commit fails.
Insert OnlyCreates a new record if an entry does not exist in AR System form. Skips the record, if an entry already exists in an AR System form. The Insert Only mode ignores the Use Checksum flag and adds a new entry.
Update RecordPerforms an update as well as insert operation, based on the Checksum conditions.
Match By Request IDSelect to match output rows by Request ID. When you select the Match by Request ID checkbox, a new entry is created when the Request ID is not provided for the input row.
Multi Match OptionSelect when more than one record can matche the given qualification. 
Configure Matching QualificationSelect when more than one record can matche the given qualification. 


Transform

Add Constants

Adds constant values to the specified field in the destination data source. For example, Calbro Services can add Disk drive as the Short Description value when transferring disc drive information from their SQL table to the BMC_Disc Drive CI class.

 Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
FieldsSpecify the name, type, and value in the form of a string. Then, specify the formats to convert the value into the chosen data type.

Select Values

This Select values step is useful for selecting, renaming, changing data types and configuring the length and precision of the fields on the stream. These operations are organized into different categories:

  • Select & Alter - Specify the exact order and name in which the fields have to be placed in the output rows
  • Remove - Specify the fields that have to be removed from the output rows
  • Meta-data - Change the name, type, length and precision (the meta-data) of one or more field

 Field and description

FieldDescription
Select & Alter tab
Step nameName of the step; this name has to be unique in a single transformation.
FieldsAllows you to rename a field and specify the length and precision.
Get fields to selectClick to insert fields from all input steams to the step.
Edit MappingClick to open a mapping dialog to easily define multiple mappings between source and target fields. 

Note: Works if there is only one target output step.
Include unspecified fields, ordered by nameSelect if you want to implicitly select all other fields from the input stream(s) that are not explicitly selected in the Fields section
Remove tabThis tab allows you to enter the fields that you want removed from the stream. You can also click Get fields to remove to add all fields from the input stream(s). This makes it easier if you are trying to remove several fields. After getting all fields, delete any of the fields that you do not want remove from the stream.
Field name
Get fields to removeClick to add all fields from the input stream(s).
Meta-data tab

This tab allows you to rename, change data types, and change the length and precision of fields coming into the Select Values step. Click Get fields to import fields from previous steps.

A lot of data type conversions are also possible with this tab.

Fields to alter the meta data for

Rename input fields, convert them to different data types, and alter the field length and precision.

Get fields to changeNot applicable

Sort Rows

The Sort rows step sorts rows based on the fields you specify and on whether they should be sorted in ascending or descending order.

Notes:

  • Kettle has to sort rows using temporary files when the number of rows exceeds the specified sort size (default 1 million rows). When you get an out of memory exception (OOME), you need to lower this size of change your available memory.
  • When you use multiple copies of the step in parallel (on the local JVM with "Change number of copies to start" or in a clustered environment using Carte) each of the sorted blocks need to be merged together to ensure the proper sort sequence. This can be done, be adding the Sorted Merge step afterwards (on the local JVM without multiple copies to start or in the cluster on the master).

 Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
Sort directoryThe directory in which the temporary files are stored in case when needed; the default is the standard temporary directory for the system
TMP-file prefixChoose an easily recognized prefix so you can identify the files when they show up in the temp directory.
Sort size (rows in memory)Specify the number of rows to be allotted in database memory for sorting tasks.
Free memory threshold (in %)

If the sort algorithm finds that it has less available free memory than the indicated number, it will start to page data to disk.

Note: This is not exact, because:

  • This is checked every 1000 rows. Depending on the row size and other steps within complex transformations this could still lead to an OutOfMemoryError.
  • In a Java Virtual Machine it's not possible to know the exact amount of free memory. As such BMC recommends you don't use this for very complex transformations with other steps and processes that use up a lot of memory.
Compress TMP FilesSelect if you want to compress temporary files when they are needed to complete the sort.
Only pass unique rows?Select if you want to pass unique rows only to the output stream(s).
Fields tableSpecify the fields and direction (ascending/descending) to sort. You can specify whether to perform a case sensitive sort (optional)
Get FieldsClick to retrieve a list of all fields coming in on the stream(s).

Calculator

Provides you with predefined functions that can be executed on input field values. For example, Calbro Services can calculate the total cost of the computer systems in all their sites and add that information to BMC Helix CMDB.

For more information, see the  Using the Calculator step video. This video is recorded using the earlier version of BMC Atrium Core and is valid for BMC Helix CMDB 9.1.

Scripting

Modified Java
Script Value 

Provides an expression-based user interface for building JavaScript expressions. Allows you to create multiple scripts for each step. For example, Calbro Services can create an if/then/else statement to identify the operating system of a computer based on its serial number. The script can also include the concatenate function, which creates a work group by appending the virtual machine type to the operating system.

 Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
Java script functions

This section provides a tree view of your available scripts, functions, input fields and output fields.

  • Transformation Scripts: displays a list of scripts you have created in this step
  • Transformation Constants: a list of pre-defined, static constants including SKIP_TRANSFORMATION, ERROR_TRANSFORMATION, and CONTINUE_TRANSFORMATION
  • Transformation Functions: contains a variety of String, Numeric, Date, Logic and specialized functions you can use to create your script. To add a function to your script, simply double-click on the function or drag it to the location in your script that you wish to insert it.
  • Input Fields: a list of inputs coming into the step. Double-click or use drag and drop to insert the field into your script.
  • Output Fields: a list of outputs for the step.
Java ScriptThis section is where you edit the script for this step. You can insert functions, constants, input fields, etc. from the tree control on the left by double-clicking on the node you wish to insert or by dragging the object onto the Java Script panel.
FieldsThe Fields table contains a list of variables from your script including the ability to add metadata like a descriptive name.
Get VariablesClick to retrieve a list of variables from your script.
Test scriptClick to test the syntax of your script.
Flow

Filter rows

Allows you to filter rows based on conditions and comparisons. After this step is connected to a previous step, you can click the "<field>", "=" and "<value>" areas to construct a condition.

 Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
Send 'true' data to stepThe rows for which the condition specified is true are sent to this step.
Send 'false' data to stepThe rows for which the condition specified are false are sent to this step
The ConditionClick NOT to negate the condition. 
Click <Field> to select from a list of fields from the input stream(s) to build your condition(s). 
Click <value> button to enter a specific value into your condition(s). 
To delete a condition, right-click and select Delete Condition.
Add ConditionClick to add a condition

Switch/Case

Implements the Switch/Case statement found in popular programming languages like Java. Pentaho routes rows of data to one or more target steps based on the value encountered in a certain field.

 Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
Field name to switchSpecify the field name that contains the value to use as a bases for the row routing. 
Use string contains

Select this option to match any portion of the text available in the fields that are being tested. of the values in 

Clear this option to exactly match the field values.

Case value data type Specify the data type of the values specified in this dialog 
Case value conversion maskSpecify the conversion mask of the values specified in this dialog (numeric / date values)
Case value decimal symbol Specify the decimal symbol of the values specified in this dialog (numeric values)
Case value grouping symbolSpecify the grouping symbol of the values specified in this dialog (numeric values)
Case values Here you can specify a value-target step pair, one per row. 
To specify a null value, simply leave the value column blank while entering a target step. 
Please note that it is possible to specify the same target step more than once.
Default target step Select all the rows that don't match any of the case values above are sent to this target step.

Identify last row in a
stream 


Lookup

CMDBLookUp

This is a BMC provided step. Allows you to look up values in the BMC Helix CMDB based on the primary key that you specified. Lookup values are added as new fields onto the stream.

 Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
ConnectionSpecify the database connection to the lookup table. 
ClassNameDisplay the required CMDB form name
Dataset IDSelect the required DatasetId from a list of IDs that you want to be looked up.
Enable cache?Select if you want to enable caching of database lookups. This means that once a key (or group of key) has been looked up, the looked up values are stored, and returned again the next time this key (or group of key) is being looked up (without incurring the cost of a database call). 
Cache size on rowsSpecify the size of the cache (number of rows).
Load all data from table

Select if you want to pre-load the cache with all the data present in the lookup table.  This can improve performance by avoiding database calls. 
However, if you have a large table, there is a risk of running out of memory. 

Important  

In memory look-ups can lead to different results because of the differences in the way your database compares data. For example, if your database table allows case-insensitive comparisons, you may get different results with this option. The same goes for data with trailing spaces. Those are ignored in certain databases when using certain character data types. 

The key(s) to lookup the value(s)The keys and conditions to perform the database lookup.
Values to return from the lookup tableThe fields from the lookup table to add to the output stream. 
  • New name: allows you to use a different name if the database column name is inappropriate. 
  • Default: is the value returned (instead of null) if the lookup fails. Note that enabling error handling on the Database Lookup step will redirect all failed lookup rows to the error handling step, instead of adding them to the main output steam with null/default values. 
  • Type: is the type of the output field. 
Do not pass rows if lookup failsSelect to avoid passing a row when lookup fails.
Fail on multiple resultsSelect to force the step to fail if the lookup returns multiple results.
Order byIf the lookup query returns multiple results, the ORDER BY clause helps you to select the record to take. For example, ORDER BY allows you to pick the customer with the highest sales volume in a specified state.
Get fieldsClick to fetch all the available fields in this step.
Get lookup fieldsClick to fetch all the lookup fields in this step.

Database Lookup

Looks up values in a database table. Lookup values are added as new fields onto the stream.

 Field and description

FieldDescription
Step nameName of the step; this name has to be unique in a single transformation.
ConnectionSpecify the database connection to the lookup table. 
Lookup SchemaSpecify the database schema containing the lookup table.
Lookup tableName of the database table used for the lookup. 
Enable cache?Select if you want to enable caching of database lookups. This means that once a key (or group of key) has been looked up, the looked up values are stored, and returned again the next time this key (or group of key) is being looked up (without incurring the cost of a database call).
Cache size on rowsSpecify the size of the cache (number of rows). 
Load all data from table

Pre-loads the cache with all the data present in the lookup table.  This can improve performance by avoiding database calls. 
However, if you have a large table, there is a risk of running out of memory. 

Important  

In memory look-ups can lead to different results because of the differences in the way your database compares data. For example, if your database table allows case-insensitive comparisons, you may get different results with this option. The same goes for data with trailing spaces. Those are ignored in certain databases when using certain character data types. 

The key(s) to lookup the value(s)The keys and conditions to perform the database lookup.
Values to return from the lookup tableThe fields from the lookup table to add to the output stream. 
  • New name: allows you to use a different name if the database column name is inappropriate. 
  • Default: is the value returned (instead of null) if the lookup fails. Note that enabling error handling on the Database Lookup step will redirect all failed lookup rows to the error handling step, instead of adding them to the main output steam with null/default values. 
  • Type: is the type of the output field. 
Do not pass rows if lookup failsSelect to avoid passing a row when lookup fails.
Fail on multiple resultsSelect to force the step to fail if the lookup returns multiple results.
Order byIf the lookup query returns multiple results, the ORDER BY clause helps you to select the record to take. For example, ORDER BY allows you to pick the customer with the highest sales volume in a specified state.
Get fieldsClick to return a list of available fields from the input stream(s) of the step.
Get lookup fieldsClick to return a list of available fields from the lookup table that can be added to the step's output stream.
Jobs

Get variables

Allows you to get the value of a variable. This step can return rows or add values to input rows.

Set variables

Allows you to set variables in a job or in the virtual machine. It accepts one (and only one) row of data to set the value of a variable.
Was this page helpful? Yes No Submitting... Thank you

Comments