Portability examples

The following sections provide examples of portability.

Carriage Return Characters (NT)

Carriage return characters pose a portability issue on Windows NT for processing text data. UNIX platforms separate text file lines with a new line character (that is, "\n" in PSL vernacular).

Windows NT separates lines in a text file with a carriage return ("\r") and new line ("\n") pair. This difference causes problems in PSL in a number of areas related to text data:

PSL cat() on a text file
PSL system() or execute() returning text data output by a child process
PSL read() or readln() on a popen() channel returning text data output by a child process

The effect is that the string variables containing text data from these sources will still contain the "\r" characters. This is problematic because the PSL string operation functions do not treat the "\r" characters as special and will only consider the "\n" as the separator.

For example, if you use the PSL function ntharg() to return the first or second line, the returned line will still contain a "\r" character. This character is also difficult to see when debugging the script with debugging output statements.

There are no PSL functions that handle "\r" characters as separators for lines. The functions and statements affected when processing these strings include the following:

Line and word processing: ntharg(), nthline(), nthargf(), nthlinef(), PSL foreach() statement
Regular expression search: grep()
List or set operations: union(), subset(), difference()

The solution for this issue is to trim the carriage return characters immediately after the input is read. Because trim() returns the modified string, the result must be assigned back to the variable:

xx = trim(xx, "\r");

This solution is pragmatic, if not particularly elegant. The PSL trim() function removes these characters before the string data is processed. You have to remember to remove them in every place where the KM gathers external text data; however, it takes only one line in each such place. An example involving PSL cat() looks similar to the following:

text = cat(pathname); # Get the text data contents of a file
text = trim(text, "\r"); # Remove all \r characters from the text
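The hidden character can be made visible by comparing string lengths before and after trimming. The following is a minimal sketch, not a definitive test: the file path is hypothetical, and it assumes the file was written with Windows NT line endings.

```psl
# Hypothetical path; any text file with NT line endings will do
text = cat("C:\\temp\\sample.txt");

# First line as returned by nthline(), possibly with a trailing \r
line = nthline(text, 1);
print("Length before trim: " . length(line) . "\n");

# Remove the carriage return and compare
line = trim(line, "\r");
print("Length after trim: " . length(line) . "\n");
```

If the two lengths differ by one, the line carried a carriage return that ordinary debugging output would not have shown.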

One way to handle this issue is to wrap all calls to PSL cat(), calling a wrapper function instead of PSL cat() each time, as shown in the example below:

function portable_cat(fname)
{
local text, save_errno;
# Get the text data contents of a file
text = cat(fname);
# Save errno just in case
save_errno = errno;
# Remove \r characters from the text
text = trim(text, "\r");
# Reinstate errno
errno = save_errno;
return(text);
}

Not all places have this issue with the "\r" character. One place where the carriage return character is automatically handled is the PSL file reading operations. The PSL read() function on a channel opened via PSL fopen() (with a non-binary mode) will not return the carriage return characters in the output. PSL read will automatically remove these characters from the returned result.

process() function (all platforms)

Operating system processes are another area where portability concerns arise. Processes are an integral part of the operating system layer, and although PSL attempts to hide some of the differences, some properties of OS processes are not abstracted.

The main abstraction layer for gathering statistical performance data about processes is the agent's process cache. The only reason it is called a cache is that it records a snapshot of the process information within the agent's memory for rapid access. This cache is a regularly updated record of operating system processes and a number of statistics about them. The process cache is available via the PSL process() function. For example, the process cache contents can be viewed via:

# List all agent process cache processes
txt = process("*");
print(txt);

The process cache is built automatically by the agent. The PSL process() function is very efficient because it operates only within the agent's internal data structures and does not perform external process analysis. Behind the scenes, the agent uses a platform-specific method to gather the data.

Gathering platform-specific data might be direct operating system statistic access (for example, Windows NT) or a system command (for example, ps on some UNIX platforms). The process cache is updated according to the process cache cycle, which is by default 300 seconds (5 minutes) but can be configured on a per-agent basis by the PATROL administrator. The process cache can also be updated immediately via the built-in %REFRESHPROC CACHE agent command, which can be launched as an ad-hoc command on the system output window or via PSL, using the PSL system() function with this command.

The process cache has a number of limitations. It might be out of date, possibly by almost 300 seconds. The process cache cannot be updated for one particular process only; the whole process cache must be updated as a group. This requirement poses problems for a KM that needs up-to-date information about a small group of processes more frequently than the process cache update cycle allows. Portability of the PSL process() function is also somewhat limited on non-UNIX platforms. All UNIX agents should function identically, but some of the attribute fields are not always available on agents running on other platforms.

The delay in refreshing the process cache can be worked around by using the PSL proc_exists() function. This function always queries the process tables immediately and therefore has up-to-date information. However, it reports only the presence or absence of a particular process and returns only a Boolean value. PSL proc_exists() cannot gather any more detailed statistics about a process. It also requires the PID of the process and cannot be used to search for processes by name.
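A presence check of this kind might look like the following minimal sketch. The PID value is hypothetical; in a real KM it would have to come from another source, such as a PID file written by the monitored application.

```psl
# Hypothetical PID; obtain the real value elsewhere (for example, from a PID file)
pid = 1234;

# proc_exists() reports only whether the process is present
if (proc_exists(pid)) {
    print("Process " . pid . " is running\n");
} else {
    print("Process " . pid . " is not running\n");
}
```

Because the check does not depend on the process cache, it can run more often than the cache cycle, at the cost of returning no statistics about the process.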

Launching Child Processes (all platforms)

The affected functions are: system(), execute(), and popen().

Child process launching is an area with a number of portability concerns. The behavior of the child processes can differ, or they might not even exist on a particular platform. The KM can launch child processes into the operating system in a number of ways using any of the previously mentioned functions.

The first issue with child process launching is whether the command exists on a particular platform. For example, the Windows NT dir command fails on UNIX, and the UNIX ls command is usually absent on Windows NT.

The first level of portability is to ensure that the correct command is executed on the correct platform.

There is no easy way to achieve this execution other than to explicitly check the platform, which is possible via the appType variable. Ported KM code might often look like the following:

# What type of agent am I
platform = get("/appType");
if ( platform == "NT")
{
cmd = "dir";
} else {
# Presumably UNIX
cmd = "ls -la";
}
txt = system(cmd); # Launch the command

Child Process Error Handling (all platforms)

The affected functions are: system(), execute(), and popen().

Launching the wrong command on the wrong platform causes the command to fail. Unfortunately, PSL does not handle this issue as elegantly as one would like. The agent launches a child process that attempts to run the command. On UNIX, it typically uses a shell, and on Windows NT, it uses the command interpreter. The errors returned by the PSL system(), execute(), and popen() functions therefore depend on the behavior of the shell or command interpreter and differ across platforms.

On UNIX, the text returned from a PSL system() call that fails is typically a shell-specific error message. This message differs depending on whether sh, csh, or ksh is the default shell for the PATROL user.

Similarly, the messages on Windows NT are from the OS command interpreter rather than PATROL-specific error message text. It is possible to check for the error message patterns, but this is not the best solution.

The PSL exit_status variable can be used to determine the exit code of the child process. This exit code can come from the shell or command interpreter, but it is generally accurate because they usually pass along the child process code or return a valid failure code if the command is not found. The exit status might have a different meaning on each OS. On UNIX, a process that exits with a return code of 0 has most likely completed successfully.
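Checking the exit code rather than matching error text might be sketched as follows. This assumes the variable is spelled exit_status and that a value of 0 means success for the command being run, which holds for most UNIX commands but should be confirmed for each platform and command.

```psl
# Example command; choose it per platform as described earlier
cmd = "ls -la";
txt = system(cmd);

# Assumption: a nonzero exit status indicates failure for this command
if (exit_status != 0) {
    print("Command failed with exit status " . exit_status . "\n");
    print("Output was: " . txt . "\n");
}
```

Reading exit_status immediately after the call avoids it being overwritten by a later child process launch.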

Preventing Command Failure

Prevention of command failures is not easy. There is no way in PSL to prevent an invalid command execution before launching the child process. There is no PSL function that will check the PATH environment variable and determine whether the command is valid and available.

Child Process popen() Pipes

The PSL popen() function launches a child process and creates pipes between the agent and the standard input and output of the process. PSL write() can thus send commands to the process' standard input, and PSL read() can read the results from standard output and standard error. This function is typically used with global channels to use a command to interact with the database or application that the KM is monitoring.
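A conversation over such a channel might look like the following minimal sketch. It is UNIX-only as written: /bin/sh stands in for whatever command-line tool the KM would really drive, and the command sent to it is purely illustrative. The sketch assumes that read() blocks until the child produces output.

```psl
# Launch a child process with pipes attached (hypothetical command)
chan = popen("OS", "/bin/sh");

# Send one command to the child's standard input
write(chan, "echo hello\n");

# Read whatever the child wrote to standard output or standard error
reply = read(chan);

# Strip carriage returns in case the child runs on Windows NT
reply = trim(reply, "\r");
print(reply);

# Release the channel when the conversation is over
close(chan);
```

A KM that keeps the channel open across parameter cycles would skip the final close() and store the channel for reuse.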

Launching Daemon Processes (all platforms)

Creating child processes in the background is a particular issue for a KM. The typical scenario is a KM that needs to relaunch a daemon process in a recovery action after it determines that the daemon is failing or absent. Because child processes are launched through the UNIX shell on some platforms, some of the common approaches fail and block the call to PSL system() until the child process completes. For example, below is a simple attempt on UNIX using the & syntax:

# UNIX backgrounding with & syntax
txt = system("/etc/my_daemon &");

This approach sometimes works and sometimes does not, depending on the UNIX platform and on the daemon itself. On some platforms it fails because the shell waits for the file descriptors of the child process. The following UNIX csh syntax sometimes improves portability by hiding the file descriptors from the shell:

txt = system("/etc/my_daemon >&/dev/null &");

This redirection syntax is shell-specific, so solving the shell portability issue requires either a separate shell script file that is launched separately or explicit coding of the shell in the command, such as the following:

txt = system("/bin/csh -fc \"/etc/my_daemon ". ">&/dev/null &\" ");

When all else fails, the PSL popen() function comes to the rescue. The PSL popen() function has the property that it launches the child process and immediately returns control to the agent. Therefore, you can simply call:

chan=popen("OS", "/etc/my_daemon");

This call returns control immediately to the PSL process and leaves the child process running. However, behind the scenes, all output from the daemon is stored in the agent's memory. Therefore, a better sequence is to detach the child process from the agent by closing the channel, as shown below:

chan = popen("OS", "/etc/my_daemon");
close(chan); # close channel without killing child

Note that the PSL close() function is called without a second argument. You should not use the flags in the second argument to kill the child process because that is not the goal.