Instances

Managing instances

When you develop a KM, you are writing an application management template. The application class can be compared to a real object class. Just like with object oriented programming, you will create instances of a class for each managed object.

When an application class is instantiated, each instance will show up with the menu commands and infobox commands as they are defined on the application class. At the same time, all parameters that are defined on the application class will be created under the instance as well. If the parameters were set as active, they will immediately be scheduled for execution.

Before the first instance is created, the only code that will be executed on the agent for your application class is the prediscovery or discovery script ⁵. Again, this is one of the reasons why discovery is so special.

To create instances you must call the create() function. It is also possible to create instances using the simple discovery rules, but we will not cover that in this book because of the limited functionality and control.

Create function

The PSL create() function has the following syntax:

create(sid,label,state,msg,parent);

The following table describes the variables available in the create() function.

Variables in create() function

Variable	Description
sid	Sid of the instance you create. This is the name you will have to use when you want to access this instance in the PATROL namespace.
label	Name (label) of the instance you create. This only applies to the visual representation of the instance on the console. If you specify the empty string, the instance sid is used as the label. The instance label can be changed at run time (at the expense of a network packet that will be sent). Usually the values for the sid and the label are the same if there is no real reason to make them different. However, they don't have to be the same. Don't just expect them to be the same, and always make sure to refer to the instance's sid instead of its label.
state	Initial state when created
msg	Message for event ("" = default)
parent	Instance sid for logical parent. The parent of the instance is an instance itself and must always be written as /APPL/INST sid. You have to make sure the parent instance already exists before it is referenced; otherwise the child instance will not be parented properly.

Only the first argument is required, but you will have to specify three arguments to create a proper instance. The last argument is used for creating nested instances and we will discuss that later in this section.

The label of the instance can be changed at run-time and the console will update the label of the instance immediately when it is changed on the agent. This feature can be useful if you want to allow the user to use a different naming convention for the instances at run time (label instances by hostname or by IP address). You should be aware that changing the label introduces network traffic to notify the consoles about the update and the label must therefore not be changed without a good reason.

Instance creation

When you are implementing the instance creation functionality of your KM you must know that end users expect a proper response from your KM. That means, if you allow users to add instances to the instance list through a response() function, and they press OK, they expect to see the icon (or an error message) after a short time.

Waiting for the next discovery cycle is usually unacceptable. Some developers think that create() can only be done from within discovery. Nothing is further from the truth. The create() function can be called from within any PSL script that runs on the agent. The following table describes the scripts used for creating instances.

Scripts used for creating instances

Script	Description
Prediscovery and discovery	This is the classic way of creating instances. At least one of the KMs in your KML will have to create instances from discovery.
Parameters	Parameters that create instances are very useful when the collection of data might lead to creation of additional instances. For example a parameter that alarms on "problem users" might as well maintain the list of instances in the "PROBLEM USER" application class.
Menu Commands	When a user explicitly asks to monitor a specific instance, one might as well do the creation of that instance right after the user input has been validated. Usually the information entered by the user is also stored in pconfig so the instances will be recreated when the agent is restarted. If creation is initiated from a menu command then you could consider moving the "initial creation" of the instances to prediscovery to minimize the code executed by discovery.
Events	This method could be used by PEMAPI consoles that would like to create instances. Also scripts using PatrolCli will eventually be executed by the event subsystem.

The only scripts in which you are unlikely to find a create() call are infoboxes. Even the system output window can occasionally be used to create instances when you are debugging certain situations.

Destroy function

The counterpart of the create() function is the destroy() function. Just like create(), destroy() can be called from anywhere in PSL. When you call destroy(), the instance you specify will immediately be destroyed and all parameters belonging to this instance will be destroyed as well. In the case of nested instances, destroying a parent instance will also destroy the child instances, regardless of the application class they belong to.

Instance destruction

When the last instance of an application is destroyed, the application icon will be removed from the console as well. Because the application icon is destroyed after the last instance is destroyed, it is a good practice to always add new instances before you remove the old ones. If you first remove old instances and this list happens to be the same as the current list of instances you have then the application icon will be removed only to be recreated after the first new instance is created.

You have to be especially careful when you have written a menu command that allows a user to "destroy all instances of this application". In that case you could get the list of currently created instances by calling get("instances");. However, if you just run over this list and call destroy() for each of the instances, it is very likely that you will destroy yourself before you finish the list. In this case you have to make sure to remove yourself from the list and destroy yourself as the last instance.
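The following is a minimal sketch of such a menu command, not a definitive implementation. It assumes the command runs at instance level, that the relative path "../instances" returns the instance list of the current application class (by analogy with get("../name")), and that destroy() accepts an absolute /APPL/INST path. Verify these assumptions on your agent before relying on them.

```psl
# Sketch only: "destroy all instances of this application" menu command.
# Assumption: we run at instance level, so ".." is our application class.
appl_name = get("../name");
my_sid = get("sid");
all_instances = get("../instances");

# Everything except ourselves. difference() returns the entries of the
# first list that do not appear in the second list.
others = difference(all_instances, my_sid);

foreach inst (others)
{
    # Assumption: an absolute /APPL/INST path is accepted here
    destroy("/".appl_name."/".inst);
}

# Destroying our own instance also ends this script,
# so it must be the very last call.
destroy(".");
```

Removing your own sid from the list up front avoids having to test for it inside the loop.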

Techniques for creating instances

Usually, developers start working on the create statements first, only to find out that adding instance destruction is a lot more difficult than they thought. In some cases the QA team or customers discover that instance destruction is not supported in the code, and most of the time this means an almost complete rewrite of the code for the developer.

At the basis of the issue lies the way KM development is usually done: first, make sure it works, then make sure it works well.

In this section we will discuss the common mistakes that are made and how you can reorganize your code so your KM will work better from the start.

It is a good practice to make sure that for every create() function you write, you also have a destroy(), just like a C developer would provide a free() for every malloc().

Classic create loop

Typically, after you have finished your create() logic, the code is already so big and sometimes so complex that it isn't easy to add the destroy() function. By adding destroy() logic, your code not only becomes bigger and more complex, but performance also starts to suffer. For the following examples, we assume you have a way to get a list of instances you want to create. The variable wanted_instances will be initialized with this newline-separated list.

In many KMs, you will see code that looks like this:

# Get the list of instances we would like to have
# substitute the ... with a command that will return this list
wanted_instances = ...;

foreach instance (wanted_instances)
{
  # Before we create the instance we should check if it wasn't
  # created before
  if (! exists(instance))
  {
    create(instance,instance,OK);
  }
}

This code works fine, but it checks every instance to see whether it still needs to be created. Performance of the code depends on the number of instances; therefore, in the example, the more instances you have, the more CPU is consumed. Even if all instances have been created already, the code will check for the existence of every instance.

Classic destruction loop

If you add destruction logic to this code, you will soon find out that it is not as easy as it might seem.

# This is the typical destruction loop
current_instances=get("instances");

foreach instance (current_instances)
{
  # We have to keep a flag to check if the instance
  # is still wanted or not
  found=0;

  # Now run over each wanted instance and check for a match
  foreach good_instance (wanted_instances)
  {
    if (instance == good_instance)
    {
      # If instance was found, set the flag and exit the inner loop
      found=1;
      last;
    }
  }

  # If this is not an instance we want (therefore not found),
  # destroy it
  if (found == 0)
  {
    # This means an instance is in the current instance
    # list but not found in the wanted instances
    destroy(instance);
  }
}

With this code, destruction logic was added to the create loop. Usually you will see that the wanted_instances list is more or less static (give or take a few instances). By looking a bit deeper at how this code affects performance, we can come up with the following formula: number of checks = current instances × (current instances + 1).

When your KM contains 50 instances, this would mean roughly 2,500 if-statements for every discovery cycle. The result of these checks is very likely to be nothing (in case the wanted_instances list didn't change).

Optimizing the create process

Take a look at the following code and notice the use of the difference() function.

# Get the list of instances we would like to have
# substitute the ... with a command that will return this list
wanted_instances = ...;

# Get the list of the current instances from the namespace
current_instances=get("instances");

# To find out which instances should be created, we use the PSL
# difference function
to_create=difference(wanted_instances,current_instances);

# This same function can be used to find the instances that should be
# destroyed.
# Read this as: What we currently have but don't want
to_destroy=difference(current_instances,wanted_instances);

# Now create the instances
foreach instance (to_create)
{
  create(instance,instance,OK);
}

# And destroy the instances we don't need anymore
foreach instance (to_destroy)
{
  destroy(instance);
}

This code does the same as the previous example, but a lot faster. The foreach loops for the create and the destroy logic are only entered when something needs to be done. Additionally, the very fast (and readable) difference() function is used instead of the slower nested foreach loops.

For each additional instance, some extra CPU will be necessary for the difference() call, but that extra cost is small compared to the extra price you would pay in the first example. Besides that, you will not do needless checks when no instances have been added to or removed from the wanted_instances list.

Create return code checking

It is a good practice in every programming language to check the return code of the functions you use. Most of the time, the return code of the create() function is not checked, although PATROL administrators like KMs that do check the return code.

# Get the list of instances we would like to have
# substitute the ... with a command that will return this list
wanted_instances = ...;

# Get the list of the current instances from the namespace
current_instances=get("instances");

# To find out which instances should be created, we use the PSL
# difference function
to_create=difference(wanted_instances,current_instances);

# Now create the instances
foreach instance (to_create)
{
  create(instance,instance,OK);

  # Let's presume we want to set some instance specific data
  set(instance."/mydata",somedata);
}

This code looks fine, but if the instance is filtered by specifying it in the /AgentSetup/<APPL>.filterList configuration variable, the create statement will fail. (More details about this variable later.)

If you didn't cater for this situation, the set() call that follows the create() will fail and generate a run-time error. Therefore, it is better to ensure that create() was successful, as demonstrated in the code below.

# Get the list of instances we would like to have
# substitute the ... with a command that will return this list
wanted_instances = ...;

# Get the list of the current instances from the namespace
current_instances=get("instances");

# To find out which instances should be created, we use the PSL
# difference function
to_create=difference(wanted_instances,current_instances);

# Now create the instances
foreach instance (to_create)
{
  if (create(instance,instance,OK))
  {
    # Now we can safely set the instance specific data
    set(instance."/mydata",somedata);
  }
}

Another way to prevent these types of errors from occurring is to recheck the current instance list after all creates have been done. This is specifically useful when the code is executed from within discovery and the discovery process also acts as a collector for each of the instances. In this case it might be a good idea to split the collection logic from the create logic, as shown in the following code:

wanted_instances = ...;
current_instances=get("instances");
to_create=difference(wanted_instances,current_instances);

# Now create the instances
foreach instance (to_create)
{
  create(instance,instance,OK);
}

# Create might have failed, but since we want to set data
# for every instance we have anyway, just requery the "instances"
# attribute
current_instances=get("instances");

# Now we can be pretty sure the instances exist
foreach instance (current_instances)
{
  set(instance."/Status/value",...);
}

# NOTE: For the super defensive programmers actually after re-querying
# you can't be sure the instances will exist.
# Nothing prevents someone from destroying an instance while you are
# running in that set loop. If you really want to be sure you must
# check for existence before every set call. And even then there is a
# theoretical moment that you will try to set something that has already
# been destroyed.

File check

In some cases, discovery is completely dependent on the content of a file. Because of this dependency, the file will occasionally need to be parsed to find the wanted instances list. A good way to accomplish this is to determine whether the file has been changed. If the file has changed, you can parse it and see whether you need to add or destroy instances, as shown in the following sample code.

# Get the current timestamp of the file we want to check
timestamp = file(file_to_check);

# Retrieve the timestamp from the namespace. The first time this executes
# it will be set to ""
old_timestamp=get("old_timestamp");

# Process only if the file has changed
if (timestamp != old_timestamp)
{
  # File has changed
  # Make sure to save the new timestamp
  set("old_timestamp",timestamp);

  # Check if file exists
  if (timestamp)
  {
    # File exists
    # Now process the file and do your actual discovery
    do_discovery();
  }
  else
  {
    # Process "file doesn't exist" (anymore)
    do_file_empty();
  }
}

In this code, the first time discovery runs, get("old_timestamp") returns NULL (and an invisible run-time error). That means that timestamp and old_timestamp will always be different unless the file does not exist. If that is the case, file() will return NULL and will be the same as old_timestamp, which does not present an issue because the file will not get processed if it doesn't exist.

If the file has been processed, the time stamp will be saved. Then the discovery cycle will look for the old time stamp again and use that time stamp to check whether the file changed.

Process check

When discovery is dependent on the existence of processes, you can use the full_discovery() call which was already described in detail in the previous section. An example is shown in the code below.

# Check if the process cache has been refreshed since the last
# time this command executed
if (! full_discovery())
{
  # If the process cache was not refreshed, exit
  exit;
}

Create icon for class

In the PATROL console, the application class can be made visible if you select "Create Icon for Class" on the KM property sheet. This application icon will not have any extra functionality and will always have the same name as the KM. All instances that are created will then always have an application icon as a parent. With nested instances, the application icon can be shown multiple times.

If you do not select the create icon for class then the console will not show a class icon as a parent to each instance. Selecting or clearing this toggle is only a visual effect and doesn't change anything in the way the namespace works.

If the check box is not checked then the application icon will not be created.

Nested Instances without Icon for Class

Example for create Icon for class

Nested instances

Nested instances are just a visual feature of the console; no additional object level is introduced in the agent tree. This is the most important thing to understand about nested instances. Although instances will appear on the console as if there is a parent-child relationship, the PATROL namespace doesn't change. There are several ways to nest instances. If you don't remember the proper syntax of the create statement, take another look at the beginning of this section before continuing.

When creating nested instances, you have to ensure that the instance sids for a certain application class are unique. This is sometimes overlooked, because the instances could appear under a different parent.
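As a minimal sketch of nested creation, the following code could run in the discovery of the child application class. The class name MYKM_CONT and the sids container1, X, and Y are hypothetical, and the code assumes that exists() accepts an absolute /APPL/INST path.

```psl
# Sketch only; all names are illustration, not part of PATROL.
# The parent must be written as /APPL/INST and must already exist.
parent = "/MYKM_CONT/container1";

if (exists(parent))
{
    # Sids must be unique within this application class, no matter
    # which parent the instance is shown under on the console.
    if (! exists("X"))
    {
        create("X", "root", OK, "", parent);
    }
    if (! exists("Y"))
    {
        create("Y", "tmp", OK, "", parent);
    }
}
```

Checking for the parent first avoids creating children that end up without the intended parent.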

The following figure shows APPL B application class with four instances. The first and fourth create statements are trying to create the instance with the same sid 'X'. The first create statement creates the instance object with sid 'X' and gives the instance icon a label of 'root'. PATROL knows this object by sid ('X') not "root". The fourth create statement is requesting to create the instance object X a second time but with a label of 'tmp'. Even though the labels are different we are attempting to create multiple copies of an instance inside of an application class. This is not possible.

Nested Instances creation (icon for class)

Main map instances

The special parent instance named "/" can be used as the fifth argument (parent) in a create() function. This will place the instance on the main map beside other computers. This is very useful if your KM monitors a computer as a proxy, or if you are monitoring network components. More or less everything that doesn't really belong under the computer can go there. You have to be aware that although this looks good, some operators dislike this approach because they lose the feeling for "which host is responsible for what".

Note

It is not possible to nest instances from different agents under a specific main-map instance.
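A main-map instance could be created as sketched below. The sid and label are hypothetical; only the "/" parent comes from the description above.

```psl
# Sketch only: place an instance on the main map next to the computers.
# "router_1" and its label are made-up example values.
if (create("router_1", "Router 1 (proxy)", OK, "", "/"))
{
    print("Main-map instance router_1 created\n");
}
else
{
    print("Main-map instance router_1 could not be created\n");
}
```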

Dummy application class instances

One of the tricks you can use to get the visual effect of nesting is to create an application class with no parameters, menu commands, or infoboxes. This class can be as simple as a definition of icons for the class, setting active to 2 in prediscovery, and exiting immediately in discovery. Whenever another KM needs a parent, it can instantiate this dummy KM (even with different icons) and then nest itself under this parent. Be aware, however, that the parent instance must already exist when you create nested instances.

Typically, application classes that serve the only purpose of being a parent have a KM name of <XXX>_CONT.

By creating such an application class you can also mimic the "Create Icon for Class" behavior, but still have a nicer label for the icon.

Determining the parent

To determine the parent instance name of an instance, you have the special built-in instance attribute parentInstance.

To determine children of an instance you must get all the instances of the application (using the instances attribute) and loop through the instances list comparing them to the parentInstance variable for matches.
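The loop described above might look like the following sketch. It assumes the script runs at instance level, that the children belong to a hypothetical application class MYKM_CHILD, and that parentInstance holds the parent in /APPL/INST form. Check the actual value on your agent before relying on the comparison.

```psl
# Sketch only. MYKM_CHILD is a hypothetical child application class.
child_appl = "MYKM_CHILD";

# Build our own /APPL/INST path
appl_name = get("../name");
my_sid = get("sid");
my_path = "/".appl_name."/".my_sid;

children = "";
candidates = get("/".child_appl."/instances");

foreach inst (candidates)
{
    # Assumption: parentInstance is stored as /APPL/INST
    inst_parent = get("/".child_appl."/".inst."/parentInstance");
    if (inst_parent == my_path)
    {
        children = children.inst."\n";
    }
}

print("Children of ".my_path.":\n".children);
```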

Limiting the number of instances

When building a KM, it is always important to think about the consequences that it will have on the host OS. It is not always easy to find relations between PSL code and CPU of the agent, but it is pretty easy to see that the agent's footprint (memory, CPU) will be directly related to the number of applications, instances, and parameters.

Limiting the number of application classes will have a limited impact on the agent's CPU. The application class itself uses some memory and CPU (prediscovery/discovery). But it really starts getting important when you start instantiating.

For example, you have four nested KMs as follows:

KM1 has 15 instances.
When you click an instance of KM1, you see 15 instances of KM2.
When you click an instance of KM2, you see 15 instances of KM3.
When you click an instance of KM3, you see 15 instances of KM4.
KM4 has two standard parameters, each with a scheduling interval of 1 minute.
All the other KMs do not have any parameters.

What is the total number of parameters that will run every second?

KM1: 15 instances
KM2: 225 instances
KM3: 3375 instances
KM4: 50625 instances
Total instances: 54,240 instances
Total parameters: 101,250 parameters every minute
Parameters per second: 1687

Based on this example, you can see that it is important to limit the number of instances, especially when you have nested instances, because the total number of instances will not be obvious on the console.

If you could rewrite the KM logic so that you can remove an application class (and a level of nesting), the results would be 15 times better. If you could halve the number of instances created per KM, the lowest level would shrink by a factor of about 16 (2 × 2 × 2 × 2). Those are very significant differences.
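The arithmetic of the four-KM example can be reproduced with a small PSL sketch like the one below. The variable names are made up for illustration. Changing per_level or levels shows how quickly the totals grow or shrink.

```psl
# Sketch only: recompute the four-KM example.
per_level = 15;        # instances created under each parent
levels = 4;            # KM1 .. KM4
params_per_leaf = 2;   # parameters on each KM4 instance
interval = 60;         # scheduling interval in seconds

count = 1;
total = 0;
for (i = 1; i <= levels; i++)
{
    # Instances on this level = instances on previous level * per_level
    count = count * per_level;
    total = total + count;
}

runs_per_minute = count * params_per_leaf;
runs_per_second = int(runs_per_minute / interval);

print("Instances on lowest level : ".count."\n");
print("Total instances           : ".total."\n");
print("Parameter runs per minute : ".runs_per_minute."\n");
print("Parameter runs per second : ".runs_per_second."\n");
```

With the values shown, the results should match the figures listed in the example (50625, 54240, 101250, and 1687).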

Limitations

After a nested instance has been created under a specific parent, it is impossible to change the parent of the instance. The only way to work around this limitation is to destroy the instance and recreate it under a new parent.

Inheritance of a number of attributes does not follow the hierarchical path of its parent. Usually when you get() a variable from the namespace and the variable is unknown, the namespace will be traversed, looking for the variable. A nested instance will not traverse its way through the parent.

Events generated from a nested child do not contain the full logical path of the instance. If this logical path is really necessary you will have to retrieve the full path of the instance yourself and trigger one of your own events. We will discuss events in detail in a later section.

Instance creation pitfalls

The following sections describe a few pitfalls that you might encounter with instance creation.

Limit number of nesting levels

After you find out how to do nesting, you should still be careful not to use the feature too much. Every time you introduce nesting, you are asking the operator for an extra mouse click. Do not just nest because you know how to do it, but do it because it adds value to the KM.

Characters you should not use

Don't start the label with a ':'. When a label starts with a colon, the console expects an icon name. Of course, when you want to specify a different icon than the one you defined in the application class, you will have to start the label with a ':'.

Avoid using '.' and '|' in your sid. The dot character is used by PEMAPI as a delimiter. The bar character is used by the layout database in the console. Never use the '/' character in the sid, since this is the namespace delimiter.

Creating invisible instances

If you create an instance of an application and you specify only 1 argument to the create() function, then the instance will be created hidden. If you want to properly create an instance, you must at least provide three arguments to the create() function.

The arguments to the PSL create function are instance name, instance label, and initial status. A common error in learning PSL is to try one single argument as follows:

create("x"); # wrong!

This will create an instance, but it is hidden because the default state in the absence of a 3rd argument is "NEW" which is an invisible state. The new instance will be created under the computer icon. You will have to drill down into a computer icon on each agent to see the new icon.
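For comparison, a minimal sketch of the same call with all three arguments, which creates a visible instance in the OK state:

```psl
# Sid, label, and initial state are all given, so the icon is visible.
if (! create("x", "x", OK))
{
    # create() can fail, for example when the instance is filtered out
    print("Instance x could not be created\n");
}
```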

Instance filtering

Adding instance filtering by using filterlist

As previously mentioned, it is possible to set the /AgentSetup/<APPL>.filterList configuration variable to contain the instances you want to filter out. This setting is probably the cheapest way to do filtering when you develop your KM.

The /AgentSetup/<APPL>.filterType variable works in tandem with the /AgentSetup/<APPL>.filterList configuration variable. Let's presume you have an application and its discovery contains this code:

for (i=0; i<10; i++)
{
  name="inst_".int(i);
  if (create(name,name,OK))
  {
    print("Instance : ".name." create succeeded\n");
  }
  else
  {
    print("Instance : ".name." create failed\n");
  }
}

This will create 10 instances: inst_0 ... inst_9

Now, if you set

/AgentSetup/<APPL>.filterList = "inst_1, inst_3, inst_5"

/AgentSetup/<APPL>.filterType = "exclude"

then these three instances will not be created, but all the rest will. If you set

/AgentSetup/<APPL>.filterList = "inst_1, inst_3, inst_5"

/AgentSetup/<APPL>.filterType = "include"

then only these three instances will be created; all the rest will be suppressed. To implement filtering out an instance using a menu command, see below:

Create a new menu command titled Filter out this instance.

Add the following PSL code in the command area:

# Get our current application name from the namespace
# This expects we are running on instance level (like in a menu command)
appl_name = get("../name");

# Now get the sid. This is not the same as the "name"
inst_sid = get("sid");

# Apply the configuration
pconfig("MERGE","/AgentSetup/".appl_name.".filterList", inst_sid);

# Destroy our current instance. This will actually remove our
# process as well and should therefore be executed as the last function
# in our PSL script
destroy(".");

This script will add the sid of the instance to the filter list and will immediately destroy the instance.

When filtering out an instance, ensure you have a way to undo this. After you have filtered out all instances, you end up with nothing, and there might not be a way in your KM to add instances or edit the filter list if none of the instances are available. To protect against this, add a function in your KM that ensures an offline setup icon is created before the last instance is removed.

Filtering suggestions

When you see that you are creating too many instances, you must introduce some sort of filter capability for your KM. The following are some suggestions for filtering criteria you could use.

Condition X

For example, monitor only the top X instances.

X could be a user-defined criterion like bandwidth, %CPU, memory, number of transactions, or even a combination of these. Conditions could be TOP, LOW, HIGH, BETWEEN, and so on. This would allow the end user to specify the important criteria and at the same time remove the instances that are not interesting enough.

Show only NOT OK instances

This is really useful for user monitoring. The operator might not care about a certain user unless there is an issue with that user. This type of monitoring is also called desired state monitoring, because only instances of a certain state are shown. Don't forget that "not ok" doesn't necessarily mean ALARM. For example, a backup connection that went from offline to OK is probably an indication that something is wrong (why else would we have switched to backup mode?).

Remove instead of filter

Sometimes instances have been created for historical reasons, and no one has ever bothered to remove them. If an instance is only showing up because it has always been this way, find out whether it would make sense to just remove the instance. Maybe the data reported under the instance is only needed for reporting and its history is never even used.

Consider moving the information in the parameters to a report that can be requested by executing a menu command. For our four-KM example, this would mean removing the parameters in KM4 (and KM4 altogether), and instead adding a report menu command that returns the data the user would otherwise get by clicking on the parameters. This would also change the collection of data from a "scheduled collection" to a "collect on request". In KM3, you can then create some summary parameters of the parameters previously shown under KM4 ⁶.

Allow a configurable limit

In certain cases, it is a good idea to add a limit to the number of instances that will ever be created. This can be useful in situations where you can potentially end up in a runaway instance creation cycle (instances are created because something is going wrong with the application).
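A configurable limit could be sketched as follows. The configuration variable /MYKM/maxInstances and the default of 100 are hypothetical, and the sketch assumes that pconfig("GET", ...) returns an empty value when the variable is not set.

```psl
# Sketch only. /MYKM/maxInstances is a made-up configuration variable.
max_instances = pconfig("GET", "/MYKM/maxInstances");
if (! max_instances)
{
    # Hypothetical default when nothing is configured
    max_instances = 100;
}

# substitute the ... with a command that will return this list
wanted_instances = ...;
current_instances = get("instances");
to_create = difference(wanted_instances, current_instances);

# Count the instances that already exist
count = 0;
foreach instance (current_instances)
{
    count++;
}

foreach instance (to_create)
{
    if (count >= max_instances)
    {
        print("Instance limit of ".max_instances." reached\n");
        last;
    }
    if (create(instance, instance, OK))
    {
        count++;
    }
}
```

Only successful creates are counted, so filtered instances do not use up the limit.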

Instance pitfalls

The following content is related to the Transient instance history issue.

Transient instance history issue

When history expires, what happens to the data in the param.hist file? Let's say you have a file system instance that exists for only a few moments and is then unmounted or excluded. Would the history for the parameters of that instance persist in the history file forever? Does data ever get "erased"? Can the history file go down in size?

You could think of the PATROL History database as a "Circular" file, where the latest data is written, and the expired data is written over. The history database could increase in size if you added other parameters/application classes or increased the retention period of any of your parameters.

But then what about reducing the file size? The issue is that one would like to achieve certain effects with custom KMs without blowing up the history file. This is proving rather difficult. For example, occasionally bringing in unique and temporary file systems would slowly bring the size of the history file up. There would be no certain limit. Is there a way to store history on continually changing instances without having the history file run away? This description of the issue is commonly known as "history pollution".

There is a method you can use to work around this, but it might not be acceptable for all cases. Let's say you would like to monitor processes and display the combination of process name + PID and maintain some history about them.

This will be a very good example to get maximum history pollution, since the process name and PID is an extremely unique thing, and it's unlikely that this combination will ever reoccur after the process has died.

It's important to know that the history uses the combination of /APPL/SID/PARAM as an index to store the historical data.

What the end user will see on the console is not the instance ID, but the instance label (which is the "name" in the namespace), and that is something we can use. Just think of an instance ID as a "slot number" and the instance label as the thing we want the end user to see. Applied to our processes example, you could come up with something like this:

/APPL/SID =

- /PROCESS/SLOT1
- /PROCESS/SLOT2
- /PROCESS/SLOT3

However, the visible instance names would be

/APPL/INST =

- /PROCESS/inetd-445
- /PROCESS/ksh-336
- /PROCESS/xterm-776

As long as the number of processes is reasonable, we will not pollute the history more than necessary: we only create the number of slots we need, and reclaim a slot whenever one becomes available.
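The slot bookkeeping described above can be sketched as a small model. This is plain Python, not PSL, and it is only an illustration: the class name SlotPool and its methods are hypothetical. In a KM, the returned slot name would be used as the sid argument of create() and the process key as the label.

```python
import heapq


class SlotPool:
    """Maps short-lived process keys onto a small, reusable set of slot SIDs."""

    def __init__(self, prefix="SLOT"):
        self.prefix = prefix
        self.by_key = {}      # process key (used as label) -> slot sid
        self.free = []        # min-heap of reclaimed slot numbers
        self.next_number = 1  # next slot number that was never used

    def acquire(self, key):
        """Return the slot sid for this process, reusing a free slot if possible."""
        if key in self.by_key:
            return self.by_key[key]
        if self.free:
            number = heapq.heappop(self.free)  # lowest reclaimed slot first
        else:
            number = self.next_number
            self.next_number += 1
        sid = f"{self.prefix}{number}"
        self.by_key[key] = sid
        return sid

    def release(self, key):
        """Give the slot of a dead process back to the pool."""
        sid = self.by_key.pop(key)
        heapq.heappush(self.free, int(sid[len(self.prefix):]))
        return sid


pool = SlotPool()
for label in ("inetd-445", "ksh-336", "xterm-776"):
    print(label, "->", pool.acquire(label))
pool.release("ksh-336")
print("vi-901 ->", pool.acquire("vi-901"))  # reuses the slot freed by ksh-336
```

With this scheme the number of distinct sids, and therefore the number of history indexes, never exceeds the highest number of processes monitored at the same time.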

This approach might sound very good, but it has an issue as well. If process A was using slot 1 and dies, the slot becomes available for a newly discovered process, let's say process B.

If we now ask for history of a parameter belonging to process B, we will not only get the history from the B process, but also the old history of the A process (or even of every old process that occupied the slot in the past). That means we have to provide additional information on the graph, so that the end user is not confused about what information he is actually looking at. One way to do that would be to annotate every time a process occupies a slot and when it releases the slot.

Another way would be to define a certain "impossible" value for each of your parameters and set the parameter to this impossible value whenever a switch occurs, or whenever nothing is occupying the slot.
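To show how such a marker value can be used when reading a slot's history back, here is a minimal sketch in plain Python (not PSL). The value -1.0 is an assumption that only works for a parameter that can never be negative, and the function name split_by_occupant is hypothetical.

```python
SENTINEL = -1.0  # assumed "impossible" value for a parameter that is never negative


def split_by_occupant(points, sentinel=SENTINEL):
    """Split (timestamp, value) history into runs separated by sentinel samples.

    Each returned run holds the datapoints of one occupant of the slot.
    """
    segments = []
    current = []
    for timestamp, value in points:
        if value == sentinel:
            # A marker closes the run of the previous occupant, if there was one.
            if current:
                segments.append(current)
                current = []
        else:
            current.append((timestamp, value))
    if current:
        segments.append(current)
    return segments


history = [(0, 12.0), (60, 15.0), (120, -1.0), (180, -1.0), (240, 3.0)]
for run in split_by_occupant(history):
    print(run)
```

The two runs in this example belong to two different occupants of the same slot; the marker samples in between show where the slot was empty or changed hands.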

Of course, it is also possible to combine both methods explained above. To understand why this workaround is needed, it helps to go into the details of how the agent stores history.

The history file uses blocks to store data for parameter/instance values. Over time, blocks are freed (because of history retention) for use with new data, but the new space allocated will only be for values of the same parameter/instance. In other words, if you have an instance /FILESYSTEM/tempfs01 and store history on it, when the history expires, there will be room allocated in param.hist for /FILESYSTEM/tempfs01 (only) even if that instance does not exist anymore. There is currently no direct method in PSL to handle this issue.

This is certainly different from how most people think it works. Usually, one assumes that once the history for /CLASSx/INSTx/PARx is out of date, the space it took becomes available for any other /CLASSx/INSTx/PARx to fill.

Actually it is quite logical that this is not the case. The history file works just like a database. It is indeed the case that whenever history is outdated, the occupied space will become available again for other /CLASSx/INSTx/PARx values.

The issue lies somewhere else. When you set the history retention period to 7 days, the oldest datapoint for a certain /CLASSx/INSTx/PARx will not be removed from the history database as long as the time difference between the oldest and the newest datapoint is less than 7 days.

In case INSTx is reasonably unique (for example, a "PID-PROCESSNAME" combination) and the history retention period is 7 days, you will eventually end up with a huge history file, because the history remains after the process goes away.

Maybe another example is better: Let's say we have a process that restarts every 2 days. That means every two days it will be assigned another PID, and therefore another instance. The data in the history database will not be cleaned up until there are 7 days' worth of history for that instance, which never happens because each instance only collects 2 days of data. That means the index and data in the history database will never be removed.
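The effect of this rule can be illustrated with a toy model. This is plain Python and not how the agent is implemented: the store is a dictionary keyed by /CLASS/INST/PARAM, one datapoint is recorded per day, and the parameter name CPU and the PID numbering are made up for the example.

```python
DAY = 86400  # seconds


def record(store, key, timestamp, value, retention):
    """Append a datapoint, then expire points of the SAME key that are older
    than (newest timestamp - retention). Other keys are never touched."""
    points = store.setdefault(key, [])
    points.append((timestamp, value))
    newest = points[-1][0]
    store[key] = [(t, v) for t, v in points if newest - t <= retention]


def simulate(days, restart_every, retention_days, same_sid):
    """Record one datapoint per day for a process that restarts periodically."""
    store = {}
    for day in range(days):
        generation = day // restart_every      # each restart gets a new PID
        sid = "SLOT1" if same_sid else f"proc-{1000 + generation}"
        record(store, f"/PROCESS/{sid}/CPU", day * DAY, 0.0, retention_days * DAY)
    return store


unique = simulate(30, restart_every=2, retention_days=7, same_sid=False)
reused = simulate(30, restart_every=2, retention_days=7, same_sid=True)
print(len(unique), sum(len(p) for p in unique.values()))
print(len(reused), sum(len(p) for p in reused.values()))
```

In this model the run with a unique sid per PID keeps every datapoint it ever wrote, spread over one index per restart, while the run that reuses a single sid settles at roughly one retention period of data.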

In the best-case scenario, the process continues to run and only occupies 7 days' worth of data: since the history works like a round-robin database, data older than now minus 7 days is overwritten with new data.

Whenever the process dies (and the instance is removed), your history data will not be cleaned up, because the rule is to keep 7 days' worth of data. It just sits there, and you can't access it.

For most KMs this is OK. For example, a file system is unmounted and its instance is removed; when the file system is remounted, the instance is recreated and the "old" data is still available.

However, in our example it is very unlikely that the PID-PROCESSNAME combination will ever reoccur, so your data will just sit there.

Since no new data is added, there is no reason for the agent to clean it up (as mentioned before, history is cleaned when the delta between the time values of the oldest and newest datapoints is greater than the history retention period).

It doesn't matter how many times a year the process restarts; what is important is that the agent will store (and not necessarily free) data for each /CLASSx/INSTx/PARx combination. The only freeing that happens is either console triggered (a developer console can call "Clear History") or automatic, when the time range of a parameter exceeds the history retention period.