Unit Testing Best Practices
Automating mainframe unit testing is business critical. But best practices for automated unit testing are not always apparent or well understood, as true automated unit testing on the mainframe was unavailable until the release of BMC AMI DevX Total Test. Before we get into best practices for carrying out this method of testing, here are some things to consider.
For more information about effective mainframe testing, see Ten Steps To Effective Mainframe Testing.
What is Unit Testing?
Unit testing is the earliest point in development at which you can remove identified flaws and ensure program logic works. Flaws are found while developers can still easily recall what they coded, making issues easier to resolve. Ultimately, the value of unit testing is that it lets you trap bugs within development, thereby minimizing the negative consequences of failures that otherwise would be externalized.
Mainframe applications are composed of many programs and sub-programs, along with sub-system services for databases, stored procedures, file access and communications. Currently, Total Test treats a program, sub-program and stored procedure as the units that can be tested. An additional level of verification can be done on data that is written from a program into QSAM, VSAM, Db2 (Insert, Update, Delete), IMS (ISRT, REPL, DLET) or many CICS commands (e.g., WRITEQ and many others).
However, mainframe unit testing has historically been a practice of manual processes that are laborious and time-consuming—to the extent that developers will cut short or even forgo unit testing to ensure other priorities are accomplished. The consequence of neglecting thorough unit testing is lower code quality throughout the delivery pipeline, which drives down efficiency in future testing due to extant low-level bugs. This reduces delivery velocity and quality.
Why automate Unit Testing?
Automated unit tests verify that a program works, and continues to work, as the developer intended. Without automation, setting up test environments takes a long time, and testing teams often serve more than one development team. Thus, an individual developer’s changes may have to wait in a “queue” for quite some time before the developer knows whether those changes passed or failed their tests.
Due to these time constraints, if developers don’t accomplish unit testing, root causes of failure are much harder to identify in later stages of the application life cycle—any of the components that have been changed since the earliest stages of development could have caused the failures.
With automation, developers can quickly run tests independently after every code change so they get feedback immediately while they’re still familiar with the changes. As new changes introduce new failures, identification of the cause and location of each issue is made quick and simple through automated unit testing.
Once a developer’s changes pass all unit tests, they can be sufficiently sure their changes fulfill any new requirements and don’t break existing functionality. With this knowledge, they can start working on new requirements or new projects without as much fear of being interrupted later to revisit changes and resolve issues that should have been identified earlier.
In summary, automating unit testing not only reduces the time it takes to identify and resolve issues, thereby improving developer productivity; it also improves focus on tasks by decreasing the likelihood of interruptions, further improving productivity.
Better management and control of software projects
Automated unit testing improves managerial duties for product managers and functional managers, both of whom are concerned with daily developer productivity, in the following ways:
- Project trackability: Automated unit testing exposes failures and incomplete implementations so they can be reported early in the life cycle and addressed before accumulating and causing project delays.
- Project manageability: Failures identified in automated unit testing can be classified once the cause and its type and location have been determined. When bugs are better classified, they can be prioritized to eliminate the worst problems first.
- Insurance against regressions: Automated unit tests can verify that developers’ new changes don’t break existing code. These tests virtually eliminate accidental or unintended side effects.
- Efficiency and speed: Automated unit tests are significantly faster to run than manual tests. They require just a few people to review results and identify test failures, which speeds time to error resolution because it’s faster and easier to assign the right person to fix the failure. Test setup can also be very time consuming and error prone. Automated test setup and tear-down provide additional time savings over manual testing and minimize the risk of a bad test setup.
- Planning and predictability: Applications are complex systems made from individual components that must be reliable to ensure the overall system works. Unit testing verifies the components making up the larger system. Automating unit testing is the only way to effectively and efficiently ensure the reliability of those individual components.
- Customer satisfaction: Automated unit testing can improve customer satisfaction by preventing copious errors from escaping and by providing a baseline assurance that the product works when shipped. Confidence is much higher with good automated unit tests.
Developer driven unit tests
Best practices of unit testing during the development process are as follows:
- Unit tests are created and executed during development, either manually by developers or through automation with a tool like Total Test. Functional/integration/system tests are usually created and executed by a test organization. These tests can also be automated with Total Test’s functional testing capabilities.
- Unit testing verifies a single program or a single part of a program. Functional/integration/system tests verify that several components of the system work together. Unit tests try to test a component in isolation from the rest of the system. They may use mocking/stubbing to eliminate the need for other system components, which allows unit tests to focus on a specific component. Often, this also helps the tests run more consistently by eliminating component behavior that varies, for example, based upon the date of execution (see the sketch after this list).
- Unit tests should be run after every build, which means they are run much more frequently than functional/integration/system tests. Developers will run unit test suites after making updates to ensure their most recent fixes haven't broken existing code. Functional/integration/system tests are scheduled for a team and run at scheduled times.
- Unit testing provides information about the specifics of a test failure, type of failure and location. Functional/integration/system tests usually require more development investigation to find the cause of a failure.
- You will have many unit tests but fewer functional/integration/system tests for a single program, partly because many of the latter are executed manually and partly because each functional/integration/system test encompasses more features in a test case.
- With a tool like Total Test, unit tests have automated test setup and tear-down, so a test can be run repeatedly at any time. Automatic setup and tear-down can also be done for functional tests by Total Test. Without Total Test, a test engineer is generally required to set up test systems and test data.
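To make the isolation idea above concrete (see the bullet on mocking/stubbing), here is a minimal sketch in plain Java/JUnit—an analogy only, not Total Test syntax. The interface, class names and rate values are hypothetical; the point is that a hand-written stub stands in for a Db2-backed lookup, so the test needs no database and no live sub-system.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.math.BigDecimal;
import java.math.RoundingMode;
import org.junit.jupiter.api.Test;

// Hypothetical names for illustration only; they do not come from Total Test.
interface RateLookup {                       // normally backed by a Db2 table
    BigDecimal rateFor(String accountType);
}

class InterestCalculator {
    private final RateLookup rates;
    InterestCalculator(RateLookup rates) { this.rates = rates; }

    BigDecimal monthlyInterest(String accountType, BigDecimal balance) {
        // Business rule under test: balance * annual rate / 12, rounded to cents.
        return balance.multiply(rates.rateFor(accountType))
                      .divide(BigDecimal.valueOf(12), 2, RoundingMode.HALF_UP);
    }
}

class InterestCalculatorTest {
    @Test
    void monthlyInterestUsesStubbedRate() {
        // The stub replaces the real Db2 lookup, isolating the unit under test.
        RateLookup stub = accountType -> new BigDecimal("0.12");
        InterestCalculator calc = new InterestCalculator(stub);

        assertEquals(new BigDecimal("10.00"),
                     calc.monthlyInterest("SAVINGS", new BigDecimal("1000")));
    }
}
```

Total Test achieves the same effect without code by generating data stubs for the sub-system calls of the program under test.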
Improve Code Coverage
Verify how much ground your existing manual test cases cover with a code coverage tool that provides visibility into the blocks of code the application is executing. This review helps you understand whether you have a good set of test data driving your application testing. Test cases may have limited coverage because you have limited data, but adding more data indiscriminately won't help; you need specific data to drive execution through different logic branches.
Error path testing, or negative testing, is something you should strive to include in your automated testing. Code coverage identifies the gaps in your testing and helps you determine if you need additional test cases, thereby providing direction on what to test next. Code coverage generally will not hit error-handling blocks of code unless you specifically provide data to trigger the error handling.
If you have important error handling in your programs, it should be tested. Purposefully wrong data should be created and used to drive program execution into error-handling pathways. You need to carefully define what constitutes a successful test versus a successful execution. A successful test verifies that the program's error handling executes correctly and that the wrong data is identified and handled without the program crashing or, worse, silently failing.
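As a hedged illustration (plain Java/JUnit rather than Total Test, with a made-up validator and error code), a negative test might look like the following: purposefully wrong data drives execution into the error-handling path, and the assertion treats the correctly reported error as the successful outcome.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// Hypothetical validator used only to illustrate error-path testing.
class RecordValidator {
    static final String ERR_BAD_DATE = "INVALID-DATE";

    /** Returns an error code for bad input, or null when the record is clean. */
    String validate(String yyyymmdd) {
        if (yyyymmdd == null || !yyyymmdd.matches("\\d{8}")) {
            return ERR_BAD_DATE;   // error-handling path we want coverage for
        }
        return null;
    }
}

class RecordValidatorNegativeTest {
    private final RecordValidator validator = new RecordValidator();

    @Test
    void rejectsAlphabeticDate() {
        // Purposefully wrong data drives execution into the error-handling path.
        assertEquals(RecordValidator.ERR_BAD_DATE, validator.validate("2023AB15"));
    }

    @Test
    void rejectsMissingDate() {
        assertEquals(RecordValidator.ERR_BAD_DATE, validator.validate(null));
    }
}
```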
Schedule Unit Tests
You should rerun your unit tests whenever a component within the application/program is rebuilt. You want to run the tests before developers forget changes they’ve made. Relearning the code to fix a problem later is very expensive. We recommend a few options when rerunning unit tests in an automated fashion:
- After each compile/build of the program under test
- When developers check in their changes to your SCM, so they get immediate feedback
- On a daily/nightly basis so developers get daily feedback before they forget their changes
Rerun test suites
If you can determine the items below, you can limit rerunning tests to specific cases:
- All dependent software related to an updated program
- Which tests test all the changed programs
- All tests associated with all related programs
Unfortunately, you probably don't know all that information and it would be very rare if all your code and test cases were that well-defined and organized. The cost of running an automated regression unit test is low compared to the cost of finding a bug in the field. We would err on the side of rerunning the entire test suite.
Where to execute Unit Tests
A problem for many mainframe testers is the limited availability of test environments that are fully configured for running tests. Because these systems are scarce resources, you usually have a rigidly scheduled time when they are available for your program testing. Unfortunately, testing time can vary widely, depending on the types of issues found. Often, other test teams will overrun their allocated time and foul up the schedule for subsequent teams. Therefore, the more test system options you have, the more likely you can maintain your project plans.
In Total Test, the term unit test isolation refers to the ability to stub out data and sub-systems like IMS, Db2 and CICS. Unit test isolation reduces the dependencies on a specific test system. Because you have fewer test system dependencies, your tests have more flexibility to run on a wider range of test systems.
For maximum isolation and flexibility, “test stubs” or "test mocks" should be used whenever possible. This encompasses:
- Data passed into a program from calling programs and external data sources
- Data leaving a program to be written to external data sources
External sources could mean, for example, VSAM/QSAM datasets, Db2 tables and IMS databases. Instead of accessing the “real” components, unit tests make use of simulation data in the form of stubs or mocks, also called virtualized data and commands. The reduced number of dependencies from using "test doubles" may give you more options for where your tests can be executed. Don't just assume you need a complete test environment to run your unit tests. If you’re waiting for a complete test environment to become available, check whether a less complete test system would suffice for running your unit tests.
A cost factor to consider when executing tests on an LPAR is the system software running on that LPAR. Even if the program doesn’t use it, you must pay for Db2, IMS, CICS, etc. if they are active on the LPAR. So, for unit tests whose sub-system requirements have been removed, you may consider running them on a very “basic” LPAR and avoiding the cost of system software. Ideally, customers could set up a non-Db2, non-IMS, non-CICS LPAR for unit test execution.
As a result, unit tests should be able to execute anywhere, at any time, in any order, regardless of the availability of other application components, databases or data within the databases. It’s this independence that allows for executing unit tests at the level of frequency we’ve recommended here.
Creating Unit Tests
The best practices in creating unit tests are described in this section.
Focus on testing a single thing
If a test case attempts to validate more than one item, it requires more work to determine which item in the test failed. It’s much better to have each test case focus on validating a single item; this also makes it easier to diagnose why the test case failed and easier to maintain the test case.
Focusing on testing a single item also generally requires less test data for the test case. Given that data must also be maintained for the test case, more data means more data maintenance in addition to code maintenance. The goal of “test one item” can be difficult with COBOL programs that are large (hence the value of modularizing code by splitting logical sets of code into sub-programs).
Total Test makes it easy to copy a unit test and edit data stubs to create new test cases. We recommend selecting the smallest amount of data that executes the item under test. To test another similar item, copy the unit test and make the required change to the test data in the data stub. The goal should be many small tests with small amounts of data.
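A rough sketch of the same pattern in Java/JUnit terms (illustrative only; the mapping rules and names are invented) shows many small tests, each validating a single item with the smallest possible amount of data.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// Hypothetical region-code mapping; illustrative only, not a real program.
class RegionMapper {
    String regionFor(String stateCode) {
        switch (stateCode) {
            case "NY": case "NJ": case "CT": return "NORTHEAST";
            case "CA": case "OR": case "WA": return "WEST";
            default:                         return "OTHER";
        }
    }
}

class RegionMapperTest {
    private final RegionMapper mapper = new RegionMapper();

    // Each test validates exactly one mapping with a single input value.
    @Test void mapsNewYorkToNortheast() { assertEquals("NORTHEAST", mapper.regionFor("NY")); }
    @Test void mapsCaliforniaToWest()   { assertEquals("WEST",      mapper.regionFor("CA")); }
    @Test void mapsUnknownToOther()     { assertEquals("OTHER",     mapper.regionFor("TX")); }
}
```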
Identify where in the process to unit test
Think about when you want to unit test in the development process and application life cycle. This information will help later when you integrate with DevOps automation processes. Organizations vary in running automated unit tests: after every code commit; after a software build; at a scheduled interval; before starting user acceptance testing; before promoting to production; or just on demand. Your organization should choose when to run automated unit tests based on your specific process.
Ensure Unit Tests pass
You need to get a test project executing successfully before checking it in to a test repository, adding it to your CI/CD system (e.g., Jenkins) or even sharing it with co-workers. Get the happy-path tests working first. Add boundary condition tests to verify high and low values. When using negative tests, set the conditions so that tests of expected error conditions also pass. In total, make sure that 100% of your unit tests pass, and after you have applied changes to your code, make sure that this continues to be the case—before you check in your changes to version control. Total Test allows you to change the assertions used in a test. If needed, you can also remove an assertion that isn’t relevant to a test. For example, an assertion on a date value that changes with each run, causing test failures, can simply be removed if the date isn’t important to the test validation.
Unlike in later stages of testing, there is no wiggle room for interpretation of the results. If any unit test fails, it means the program does not work as desired. Of course, this also means that your tests themselves have to be maintained with your application.
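The progression described above—happy path first, then boundaries, then expected-error conditions, with everything passing before check-in—might look like this as a Java/JUnit analogy. The credit-limit rule and its threshold are hypothetical.

```java
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

// Hypothetical approval rule; the 5000 limit is made up for illustration.
class CreditCheck {
    static final int LIMIT = 5000;
    boolean approve(int requestedAmount) {
        return requestedAmount > 0 && requestedAmount <= LIMIT;
    }
}

class CreditCheckTest {
    private final CreditCheck check = new CreditCheck();

    @Test void happyPathTypicalAmount()     { assertTrue(check.approve(1200)); }                  // happy path
    @Test void boundaryExactlyAtLimit()     { assertTrue(check.approve(CreditCheck.LIMIT)); }     // high boundary
    @Test void boundaryJustOverLimit()      { assertFalse(check.approve(CreditCheck.LIMIT + 1)); }
    @Test void negativeZeroAmountRejected() { assertFalse(check.approve(0)); }                    // expected error condition still passes
}
```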
Make Unit Tests reliable
You really don't want to track down a failed test case only to discover it was a false failure. A test case needs to fail only when there is a program error and pass when the program works. Get your unit tests working reliably in your development environment before adding them into a test case repository. Don't have your team chasing phantom failures.
Include Code Coverage
You want to understand how good a job your test cases have done to verify your code base. When—and how—do you know that a 100% pass rate is a good indicator of the quality of your code? Only when you can be sure that your tests have executed a sufficiently large portion of your code, and that the critical parts of your program have been executed. Code coverage tools are the best way to identify which parts of your code have been executed during your tests and which haven’t.
Code coverage tools also help you manage the size of your testbed. It does not make sense to execute the same (standard) path several hundred times, while the critical (sometimes exotic error handling) paths don’t get tested at all. Tools for static code analysis can be valuable resources to help you determine:
- Different paths through your program
- Conditions under which these parts get executed
- Estimates for the number of test cases needed
For the latter, the cyclomatic complexity metric—also known as the McCabe metric—provides a lower bound for the number of different test cases needed to reach full code coverage.
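For reference, cyclomatic complexity is computed from the program's control-flow graph; for a single structured program containing only binary decisions, it reduces to the number of decision points plus one.

```latex
% E = edges and N = nodes of the control-flow graph,
% P = number of connected components (P = 1 for a single program)
M = E - N + 2P
% For a single program with only binary decision points, this reduces to:
M = \text{(number of binary decision points)} + 1
```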
On the other hand, a high code coverage percentage is a necessary condition for a high-quality test suite, but not always a sufficient one. For example, executing a certain path once may not always be enough, and if boundary values determine whether a certain path gets executed, those boundaries should also be tested.
While experts like Jez Humble and David Farley stress the importance of a coverage of no less than 80% of your code—for each individual program under test, not on average or in total—we would like to stress that you should also make sure the right 80% gets executed.
Keep Unit Tests independent
A unit test should only test a single item and shouldn’t depend on the execution of other test cases, so you should eliminate dependencies between them. Some people run many programs in a series because the output from earlier programs feeds data to subsequent programs.
Total Test’s data capture lets you unit test programs without these dependencies because you have previously captured the data for the test. You can also capture or specify the program parameter values so you don't need to re-execute the preceding programs to compute the values required for the program under test.
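By analogy (illustrative Java/JUnit only; Total Test does this through captured data stubs rather than code), the test below hands the unit under test previously captured output from an upstream step, so the upstream program never has to run.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.List;
import org.junit.jupiter.api.Test;

// Hypothetical downstream step that normally consumes a file produced by an upstream batch job.
class BalanceSummarizer {
    long totalCents(List<Long> postedAmountsCents) {
        return postedAmountsCents.stream().mapToLong(Long::longValue).sum();
    }
}

class BalanceSummarizerTest {
    @Test
    void summarizesCapturedUpstreamData() {
        // "Captured" records stand in for the upstream program's output,
        // so this test does not depend on running that program first.
        List<Long> captured = List.of(1050L, 2500L, -300L);

        assertEquals(3250L, new BalanceSummarizer().totalCents(captured));
    }
}
```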
Test smaller units
It can be very difficult to test a single thing in a large COBOL program. Many people don't use sub-programs to provide logical units to test. If you’re working on a large COBOL program and need to do extensive maintenance, we recommend breaking the program into logical sets of code and grouping them into sub-programs.
Keep your redesign goals in mind. Are you trying to make callable services? Are you trying to create microservices? Are you trying to turn a batch program into an online program?
We have seen companies successfully redesign their programs with a common core set of sub-programs and other sets of sub-programs that allow different interfaces to the core set of sub-programs. The different interfaces include a batch interface, web service interface and online CICS interface. This approach reduces the amount of duplicate code and simplifies testing because each sub-program is a testable unit.
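The shape of that redesign—one testable core called from several thin interface layers—can be sketched as follows. This is a Java analogy with invented names and rules; in practice the core and adapters would be COBOL sub-programs fronted by batch, CICS and web-service interfaces.

```java
// Core business logic, kept free of interface-specific concerns,
// so it can be unit tested on its own. Names and rules are hypothetical.
class PremiumCore {
    long premiumCents(int age, boolean smoker) {
        long base = 20_00;                 // $20.00 base premium
        if (age > 50) base += 15_00;
        if (smoker)   base += 25_00;
        return base;
    }
}

// Thin "batch" interface: parses a fixed-format record, then delegates to the core.
class PremiumBatchAdapter {
    private final PremiumCore core = new PremiumCore();

    long process(String record) {          // e.g. "051Y" = age 51, smoker
        int age = Integer.parseInt(record.substring(0, 3));
        boolean smoker = record.charAt(3) == 'Y';
        return core.premiumCents(age, smoker);
    }
}

// A web-service or online adapter would delegate to the same PremiumCore,
// so the business rules exist (and are tested) in exactly one place.
```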
Creating related Unit Tests
Once a unit test has been captured with Total Test, it’s often useful to create derivative tests based upon the generated unit test. For example, you capture a unit test and then review the code coverage report, which indicates a block of unexecuted code. You can review the code and see that by changing a single value you could execute the previously unexecuted code.
To do so, make a copy of the test case in Total Test’s test scenario editor, find the read stub where that value is read into the program, copy the input stub with the data value that needs updating and change it. Move the changed stub into your new test case and remove the old stub. You now have a new test case that executes more of the code without having to record a new test case with Code Debug.
Limit data used in Unit Tests
The goal of unit testing is to limit the code you test to a small unit of a program. That way, if a unit test fails, it’s easy to identify the area of a program having an issue. By extension, you should limit the amount of data a unit test uses so that if a program stops processing data correctly it’s easy to identify the data causing the problem. You want just enough data to test a specific item you’re trying to verify—and each unit test should only try to verify one item.
There are at least three major issues with using too much data in unit tests:
- Wasted time: Too often, we see unit tests that use much more data than they really need. When the test case fails, it requires more work because you have to sift through not only more code to determine the issue but also more data.
- Too much code tested per unit test: The other thing about using too much data in a unit test is it ends up executing more program code than a single unit test should. Sometimes more data used in the unit test means more code that will be utilized to process that data. Again, your goal should be to unit test a single item per test case. By limiting the amount of data in a single test case you also limit the amount of code that will be exercised, bringing you closer to that goal. Another way to think of this is: test a single pathway through the program.
- Redundant testing: Software doesn’t require wear testing like sneakers. Running the same or very similar data through the program won’t find new problems. You want unique test cases with unique data to exercise your business rules and program execution. Using more data just creates more work to diagnose and takes more time to execute without any added value.
These points aren’t to say you can’t use a lot of varied data for testing, but rather that you should separate the data into small sets for many different unit tests. You want to focus on the unit of code being tested and the unit of data needed to test it. Extra data just increases the amount of time required to diagnose a unit test failure.
How Total Test creates Unit Tests
In contrast to unit testing in distributed development, which is often based on the xUnit framework, Total Test does not use code written by a developer to accomplish tasks. Instead, it allows you to define tests using a graphical interface with forms and table-like editors to collect test data, define assertions and build stubs.
The results are stored in XML files, which are used by a component called the Test Runner that sets up the data handling and executes the test. The Test Runner acts as the driver for the program under test: it feeds test data into the program, handles the use of stubs, and collects returned results for the comparisons done in the test assertions.
Compared to a distributed testing framework like JUnit, this might result in slightly less flexibility when it comes to testing exotic situations. On the other hand, Total Test does not require manual coding in any programming language, thus reducing the potential for making mistakes while coding your test cases.
What’s more, Total Test integrates with Code Debug, a debugger, allowing developers to execute an existing program under the control of the debugger and collect the relevant data during execution. Additionally, you can gather code coverage metrics while creating the test cases.
Total Test projects are created inside the Project Explorer view of the Host Explorer perspective. Test suites are held within a test project. Test suites can hold test scenarios, and test scenarios hold test cases.
To create a test case, you need to know the COBOL structures used in the program and use Code Debug to gather them or import them into structures called “listructures” in Total Test. You also need to know which structures are passed to and from programs. In Total Test, these items are called interfaces. Code Debug can automatically create these, or you can manually create them in Total Test.
Unit test maintenance
When your program’s logic changes so that it computes new data values, or the program is updated to read and process new data values, the unit test cases will probably need updating. The test cases include assertions, which have expected values. If your program changes produce different values, your assertions’ expected values also need to change.
How to do Unit Test maintenance
Whether to modify a test case or to use Code Debug to re-record the program data for a test depends on the test scenarios. You can recreate a test scenario by recollecting the data and data structures and interfaces. The result is that everything in the test case is completely up to date with the program. This approach is usually quick and easy unless you have modified the data to execute specific code pathways.
- Test scenario/test case updates can be done for batch programs using VSAM and QSAM
- Test scenarios/test cases using Db2 stubs, IMS stubs or CICS stubs cannot be updated
You can update test scenarios that contain VSAM and QSAM; you can update the field structures and stubs; and if a COBOL structure is updated, you can import the updated copybook into the program. But you must find all the places in the test projects where that structure is used. This can be done by using the file search in the workspace, specifying the name of the structure that changed, and then updating the references from the old structure to the new structure. This work may mean that you must update some data stubs. If new fields were added, data for those additional fields will need to be added.
There are some limitations in Total Test related to editing Db2 SQL stubs. Currently, you can't edit Db2 stubs to add or remove fields. This is related to some internal structures we create to handle the Db2 preprocessor statements. When host variables (fields) are added or removed from a Db2 SQL call, you need to recollect data for the test case to capture those changes. However, you can also change the values of individual fields, assuming the data can be processed by the program. You can change the data values of the Db2 host variables in the Db2 SQL stubs.
There are similar limitations in Total Test related to editing IMS stubs. You can change field values in the I/O area and in the segment search arguments (SSA); however, you cannot change values in the program communication block (PCB). The PCB information is held in the PCB tab of the IMS Stub Editor. The PCB holds information used by IMS to communicate information between the program and the IMS sub-system.
If you have easy access to an existing test program and you haven't edited the stub data with specific test data, it’s probably faster to just run and recollect the data rather than doing a lot of manual editing. Currently, recollecting is the only choice for Db2 SQL programs and IMS programs.
With programs using VSAM and QSAM files, if few fields have changed and there are few records in the stub, it may be relatively quick to manually update the test case. However, if there are many field additions, it’s probably easier to just recollect data for the test case. If you have to manually edit the data, you will need to compare the stubs. You can select both files and right-click to use the Compare With option to compare the files and identify your previous changes and the structure changes.
Here are scenarios to help you decide:
| Scenario | Solution |
|---|---|
| How many fields were changed? | Compare the effort of manually updating the fields versus recollecting |
| Is the program still available with the original data? | Recollect; otherwise, manually update |
| Changes to the structure/interfaces/data stubs | If possible, recollect; otherwise, manually edit |
| Just a data calculation change, with a small number of records | Manually update; otherwise, recollect |
| Records added or removed | If there are few changes, just manually edit; otherwise, recollect |
| How much time have you spent developing specific data for executing specific logic? | If there are many changes, collect a new stub and compare the new stub to your existing stub |
| A couple of COBOL structures changed | You could manually update; otherwise, recollect |
| Is test data available? Is the program still available? | If so, just recollect; otherwise, manually update |
Easy test scenario/test case recreation
To make it easier to recreate a unit test scenario, you should keep track of the:
- Original version of the program module to generate the unit test scenario/test case
- Original data used with the program module to generate the unit test scenario/test case
If any changes were made to the original data, do a file compare between the changed data stubs and the original data stubs, and then update the newly generated data stubs with those changes.
Tracking test history
Turn on the history option in Total Test once you have instituted regular builds with regular unit tests. This option, called “Save time stamped archive files,” is a project property: right-click the root of the test project and select “Properties.” The option keeps the test archive result files in the Total Test history folder, allowing you to review the specific changes that occurred between specific test executions.
If you are moving test results and code coverage data into SonarQube, SonarQube can provide a graphical view of that data.