Transforming historical data (PTF BQY2138 applied)
Use the Data Transformation utility, BBM9MD75, to unpack and filter the record entries (REs) in the historical data extracted by BBM9MD73 (History Data Extraction utility). BBM9MD75 also prepares selected subsets of the historical data by transforming the format of each field to a format that is easily ingested by off-mainframe data analytics tools, such as Excel or Splunk. The internal binary data in REs is transformed into comma-separated values (CSV) format. Additionally, instead of specifying field offsets and lengths using the SORT utility you can specify row and column filtering in SQL select-like statements by field names, so only the desired data is extracted and formatted. This reduces the time to transfer the data off-mainframe and reduces the cost of data ingestion for tools that charge by the number of bytes of data read.
The SORT utility provides the best performance for data filtering. As such, for large historical data extracts, you may want to first filter records using the SORT utility and then pass the reduced historical data extract as input for the data transformation utility for final filtering and data transformation.
You can run the BBM9MD75 utility as a batch utility. You can insert it as a job step immediately after BBM9MD73 and/or the system SORT utility (if the RE fields have not been rearranged), or in a separate standalone job after you run BBM9MD73.
2.6.6 Processing of Missing Fields in Record Entries
The BBM9MD75 utility assumes product defined REs match the format described by the product data maps. However, REs in history data sets may come from data recorded prior to updates to the data maps delivered by product maintenance PTFs. By convention, new fields are added to previously zero-filled reserve fields or added to the end of REs. This means that REs may be shorter than expected when pre-PTF history records are processed. The data transformation utility detects when fields are extracted beyond the end of REs and replace these fields with either zeros or blanks depending on the field type. REs may also include a version number field which allows customers to detect when these changes occur and detect when new fields or changes to fields have occurred.
Prerequisites
BBM9MD75 runs as a compiled REXX load module. The IBM Compiler and Library for REXX on IBM Z product is required to run the utility. BMC Recmmends that you install the REXX library that accompanies the compiler in the Link Pack Area.
If you do not have the REXX compiler and library installed, you can run the BBM9MD75 utility with the REXX Alternate Library which is packaged with z/OS. If you use the Alternate Library the utility will run in interpreted mode, which runs less efficiently than running in compiled mode. IBM recommends that you install the Alternate Library in the LINKLIST.
Refer to the z/OS REXX Compiler and Library and the z/OS REXX Alternate Library documentation for further details.
Sample JCL for running BBM9MD75
The following is a sample JCL for running the BBM9MD75 utility:
//DTRANS01 JOB ...
//DUMPXTRT EXEC PGM=IKJEFT01,REGION=0M,PARM=’BBM9MD75’
//STEPLIB DD DSN=prefix..BBLINKLIB,DISP=SHR
//MD73IN DD DSN=BMC.SYSA.MVMVS.HISTDSN(+0),DISP=SHR
// DD DSN=BMC.SYSA.MVMVS.HISTDSN(+1),DISP=SHR
//BBACTDEF DD DSN=prefix..BBACTDEF,DISP=SHR
//BBSAMP DD DSN=prefix..BBSAMP,DISP=SHR
//BBMAP DD DSN=prefix..BBMAP,DISP=SHR
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD DUMMY
//SYSIN DD *
*
* SELECT 1: Address Space Record Map (ASRE)
*
SELECT ASREASID,ASRENAME,ASREJBID,ASREYFL,
ASRSLAP,ASRSDUR,ASRSTCB,ASRSSRB,ASRSFMC,ASRSWSS,
ASRILAP,ASRIDUR,ASRITCB,ASRISRB,ASRIFMC,ASRIWSS
FROM BBDTDM.BBMUCE01
WHERE (ASREYFL=’01’X | ASREYFL=’02’X)
& LCLTIME > “20190101000101.00”;
*
* SELECT 2: Mounted File System Record Map (FMRE)
*
SELECT *
FROM BBUTDM.BBUUCEC2;
//OUT1 DD DSN=prefix..MVMVS.CE01,DISP=(,CATLG),
// SPACE=(CYL,(500,100),RLSE),UNIT=SYSDA,
// DCB=(LRECL=32000,BLKSIZE=32760,RECFM=VB,DSORG=PS)
//OUT2 DD DSN=prefix..MVMVS.CEC2,DISP=(,CATLG),
// SPACE=(CYL,(500,100),RLSE),UNIT=SYSDA,
// DCB=(LRECL=32000,BLKSIZE=32760,RECFM=VB,DSORG=PS)
//
Note
If you are transforming data from a newer version of MainView Infrastructure that uses BBMAP data sets, add the //BBMAP DD statement in the JCL header as follows:
//BBMAP DD DSN=prefix..BBMAP,DISP=SHR
If there is no //BBMAP DD statement in your data transformation utility JCL and MAPLLQ=BBMAP appears in the TDM member, the utility issues an error message indicating the BBMAP data set cannot be opened and the associated data map member is not found.
In the sample JCL, you extract and transform two RE types from a data set extracted from a PAS in data set BMC.SYSA.MVMVS.HISTDSN(+0).
You specify two SELECT statements in the SYSIN input stream:
- The first SELECT specifies a set of fields from the RE described by the TDM member BBACTDEF(BBDTDM), and the data map in BBSAMP(BBMUCE01) or BBMAP(BBMUCE01), for the Address Space Record Map.
- The second SELECT statement specifies all the fields for the Mounted File System Record Map, BBMUTDM. The BBUUCEC2 RE, is transformed for all rows in the input data set.
- Each SELECT statement writes output to DDNAMEs OUT1 and OUT2.
BBM9MD75 DDNAMEs
The following table describes all of the input/output DDNAMEs for the BBM9MD75 utility:
DDNAME | Description |
---|---|
STEPLIB | MainView Infrastructure load library |
MD73IN | Extracted history output from MainView History Extraction utility, BBM9MD73 |
SYSTSPRT | Output messages and trace data issued by BBM9MD75 |
SYSTSIN | Ignored (dummy) |
SYSIN | Input SELECT statements specifying desired columns and rows selection criteria You can specify one or more SELECT statements. Each SELECT statement writes formatted data to output data sets with DDNAMEs OUTn . The output for the first SELECT statement is written to DDNAME OUT1. |
OUTn | Transformed output data from SELECT statement n Multiple SELECT statements require an equal number of OUTn DD statements. DCB attributes recommendations for this data set are as follows:
Any LRECL and BLKSIZE values that can accommodate the largest output record are acceptable if RECFM is set to VB (variable blocked with 4-byte RDW). |
BBACTDEF | The data set that contains the MainView product table data map (TDM) members |
BBSAMP | The data set that contains the MainView product data map members |
BBMAP | The data set that contains the data map members for new versions of MainView products. |
Condition Codes
BBM9MD75 sets the following condition codes so you can test the outcome of the data transformation and, if necessary, take appropriate action:
Condition Code | Description |
---|---|
0 | Successful execution of all SELECT statements read from SYSIN. |
4 | Warning condition All of the requested data may not have been filtered and transformed. |
8 | One or more SELECT statements failed. Validation errors have been issued in SYSTSPRT. |
>8 | A severe error condition occurred, causing the data transformation to be terminated. See error messages in SYSTSPRT and correct the error condition before resubmitting the data transformation job. |
SELECT statements
The SYSIN input stream contains one or more SQL-like SELECT statements, describing which columns and rows to filter from for a specific MainView product record. Any input lines that begin with an asterisk (*) in column one are ignored as a comment lines.
The field names included in the output for each SELECT statement are listed immediately after the SELECT keyword. The special value of asterisk (*) indicates to include all fields defined in the history record data map, in the order they are listed in the associated data map for the RE type.
You specify the target RE type in the FROM clause, which names the member in the BBACTDEF library that contains the MainView product TDM, and the library member name that contains the record data map.
Use the optional WHERE clause to specify the criteria under which the RE rows are selected. You can compare any valid REXX expression referencing field names in the specified data map in the FROM clause to any literal value.
You can include the special field names LCLTIME, UTCTIME, ORECRTIN, ORECOTYP, ORECISID, and ORECSYS in the expression. These special fields contain the following values originating from the history record OREC header:
Field Name | Field Contents |
---|---|
LCLTIME | Local time in YYMMDDHHMMSS.th character format, derived from the ORECTIME field Note The time conversion uses the z/OS CVTLSO field. This means that the leap-second offset value for the system where the BBM9MD75 utility runs must match the value set for the system where the history data is recorded. |
UTCTIME | UTC time in YMMDDHHMMSS.th character format |
ORECSYS | 4 -byte character string containing the SMFID of the system where the history record was recorded Example: |
All SELECT statements must end with a semi-colon (;) or the end of the input stream. If you use a semi-colon, it must be the last character on the input line.
SELECT statement syntax diagram
Any relational expression that REXX evaluates as a zero or one value is accepted by the WHERE clause, including functions such as SUBSTR.
Example of literals
number |
|
quotedString |
|
hexString | ‘0C4’x |
binaryString | ‘11101100’b |
Note
Incorrectly formatted literals cause syntax errors in SELECT statement.
CSV output format
The utility writes output for each SELECT statement to the respective OUTn where n is the SELECT statement number. Each record written to the OUTn DD contains columns and rows in display code format.
The first row in each output data set contains the names of the columns for the rows to follow. The second and subsequent rows contain the common fields from the OREC record header, followed by the requested columns from the REs that you specified in the FROM clause. Each row contains fields from a single RE, even if the original history extract data set was packed, and the OREC header information is repeated for each row associated with a logical record collected during an interval.
All values are separated by commas. Character fields enclosed in double quotation marks (“…”).
The common fields that precede all RE fields are listed and described in the following table:
Field Name | Format | Description |
---|---|---|
UTCTIME | Numeric | End date and time of the interval in YYYYMMDDHHMMSS.th format, represented as UTC time |
TZOFFSET | Signed Integer | Local time zone offset in minutes from UTC |
LSOFFSET | Signed Integer | Local time leap seconds offset in seconds from UTC This value is determined by the CVTLSO field of the CVT control block in the z/OS system where the data transformation utility is run. It is assumed to be set to the same value as the CVTLSO value on the z/OS system where the history data is recorded. |
ORECRTIN | 1-byte Character (hexadecimal) | Record Entry Type ID Number for the interval Characters are enclosed in quotation marks. In CSV output, this field is shown in “0xhh” format. |
ORECSYS | 4-byte Character | z/OS SMFID of the system where the history was recorded. |
ORECISID | 1-byte Character (hexadecimal) | Interval Recorder Set ID In CSV output, this field is shown in “0xhh” format.
|
ORECOTYP | 2-byte Character (hexadecimal) | MainView PAS type In CSV output, this field is shown in “0xhhhh” format. |
The fields you specified in the SELECT statement follow the common fields listed in the previous table. Each field is either a numeric value or a character value surrounded by quotation marks. Fields are separated by commas.
The following table describes how field types used in data maps are translated to the output CSV format:
Field Type | Description | Output CSV Format |
---|---|---|
B | Binary number | Unsigned Integer The number of digits depends on the length of the binary field. |
BI | Binary number | Unsigned integer Internally scaled for K, M, G, T, P, etc. |
Bn | Binary number | Unsigned Fixed-point number with n places to the right of the decimal point |
C | Character | Characters enclosed in quotation marks If the field contains embedded quotation marks, they are doubled to escape the outer enclosing quotation marks. |
F | Floating point | Exponential notation number in +/-n.nnnnnnnnnnE+/-mm format The number of digits in the mantissa depends on the length of the field. |
G | Flag bytes | Character enclosed in quotes If the field contains one or more embedded quotes, each quote is doubled to escape the outer enclosing quotes. |
H | MZZ_ZNTS time duration | Fixed point number in HHMMSS.th format |
I | IP address | IPv4 or IPv6 format enclosed in quotation marks |
L | STCK (64-bit clock word) | Fixed point number in sssssss.thtthm format (seconds) |
P0 | Unsigned packed decimal | Unsigned Integer The number of digits depends on the length of the packed decimal field. |
P1 | Signed packed decimal | Signed Integer The number of digits depends on the length of the packed decimal field. |
Q | Packed date | Integer in YYYYMMDD format |
QU | Packed UTC date | Integer in YYYYMMDD format |
R | Packed time | Signed fixed-point number in HHMMSS.t format |
RU | Packed UTC time | Signed fixed-point number in HHMMSS.t format |
S | STCK (64 bit clock word) | Fixed-point number in YYYYMMDDHHMMSS.thtthm format, representing the UTC date and time (PTF BQY2314 applied) This data type is translated to zero if the extracted history field contains a zero date/time value. A zero date/time value indicates that the date and time value was not available at the time history data was recorded. |
T (PTF BQY2254 applied) | Binary number | Signed Integer The number of digits depends on the length of the binary field. |
TI | Binary number | Signed Integer |
Tn (PTF BQY2254 applied) | Binary number | Signed Fixed point number with n places to the right of the decimal point |
U | POSIX time of day | Fixed-point number in YYYYMMDDHHMMSS.tht format representing the UTC time The resolution of the formatted time depends on the length of this field:
(PTF BQY2314 applied) This data type is translated to zero if the extracted history field contains a zero date/time value. A zero date/time value indicates that the date and time value was not available at the time history data was recorded. |
X | Hexadecimal string | Character output in “0xHHHHHHHH…” format, where H is a hexadecimal digit from the set {01234567890ABCDEF} Note Sorting by the translated output field is possible with ASCII, not with EBCDIC. |
Z | MZZ_ZNTS date and time | Fixed-point number in YYYYMMDDHHMMSS.th format, representing UTC date and time (PTF BQY2314 applied) This data type is translated to zero if the extracted history field contains a zero date/time value. A zero date/time value indicates that the date and time value was not available at the time history data was recorded. |
Comments
Log in or register to comment.