Information

This site will undergo a brief period of maintenance on Friday, 18 December at 12:30 AM Central/12:00 PM IST. During a 30 minute window, site availability may be intermittent.

Troubleshooting cell rules and policies


Content contributed by Steve Mundy.

Both the rules and policies provide a mechanism to perform actions against events. However, unlike rules, which are stored in the .mrl files in pw/server/etc/<cell>/kb/rules directory and are written in Master Rule Language (MRL), event policies are created using the policy editors available from the Administration tab of the BMC ProactiveNet Performance Manager Administration Console.

Event policies also differ from rules in that the policy instance employs an event selector that allows specification of a number of events that meets selection criteria, giving the event policy greater flexibility.

For more information about rules and policies, see the BMC Knowledge Base Development Reference Guide.

Location of rules and policies

  • Rules can be found in the pw/server/etc/<cell>\rules directory, in the .mrl files.
  • Policies are stored in the cell state file (mcdb), in pw/server/var/<cell>/mcdb. The default policies can also be found in pw/server/etc/<cell>/kb/data/di_policies.baroc. As policies rely on selectors, while restoring the default policies, you first might need to restore di_selectors.baroc.

Verifying rule and policy configuration

  • For rules, ensure that the mrl file is listed in the pw/server/etc/<cell>/kb/rules/.load file. The position in the file does matter, and its type (refine, filter, regulate etc). Check whether you can recompile the KB without any error messages. If there is any error message, fix it syntactically (if you can) or review the BMC Knowledge Base Development Reference Guide.
  • For policies, ensure that they are enabled (via the GUI or via mquery).

Enabling cell rule tracing

For troubleshooting cell rule and policy related problems, turn cell rule tracing ON as follows:

  1.  Modify pw/server/etc/<cell>\mcell.conf to contain: TraceRuleLevel=2
  2. Modify pw/server/etc/<cell>/mcell.trace to contain: ALL ALL stderr

    Warning

    Note

    All other lines in the file should be commented out. If that file does not exist then modify the one under pw/server/etc directory instead.

  3. Restart the cell

This can also be done dynamically without restarting the cell:

mcfgtrace -n <cell> ALL ALL stderr

mcontrol -n <cell> tracerule on

And to disable:

mcfgtrace -n <cell> ALL VERBOSE no

mcontrol -n <cell> tracerule off

Warning

Note

Setting the TraceRulelevel to 2 if tracing is enabled via mcell.conf and mcell.trace, or with the "mcontrol -n <cell> tracerule on" command, allows you to locate in the trace the .mrl filename and also the name of the rule which performs specific actions to the event.

Enabling rule trace dynamically negates the need to restart the cell. To assist in investigating the rule problem, and the cell trace, following are required:

Cell KB, located at pw/server/etc/<cell>

Cell xact and state file located at pw/server/var/<cell>

Also, before analyzing the rule trace, it is important to understand the different event processing rule phases as shown in the following diagram:

Event processing rule phases.png

The different phases are explained in the Chapter 1, page 26 of the BMC Knowledge Base Development Reference Guide.

#

Rule phase

Description

1

Refine

Validates incoming events and, if necessary, collects additional data needed before the event is processed further

2

Filter

Identifies events that should be discarded

3

Regulate

Evaluates events, and, if evaluated as true, collects duplicate events for a time period. If a specified threshold of duplicates is reached, the Regulate phase passes an event to the next processing phase.

4

New

Determines which events in the event repository should be updated with new information from new incoming events. During this phase following tasks are observed:

  • Actions are triggered that must be performed just before a new event comes in
  • Previously received events are updated, and the new event optionally may be dropped
Warning

Note

 This is the last opportunity to prevent an event from entering the event repository.

5

 Abstract

Evaluates events, and, if certain conditions are met, triggers the generation of abstraction events. An abstraction event is a summary event based on other events that are occurring

6

Correlate 

Determines whether any events have a cause-and-effect relationship 

7

 Execute

Specifies actions to perform when a slot of a new event matches a condition, or a slot of an old event is modified to satisfy a condition

8

Threshold 

Specifies the actions that must be performed when a certain number of duplicate events have been received over a certain time period

9

 Propagate

 Determines whether an event is forwarded to another cell or integration product

10

 Timer

Specifies actions to be executed when a timer has expired. A timer can be set in the New, Abstract, Correlate, Execute, Threshold and Delete phases.

11

Delete

Triggers actions to ensure that data integrity is maintained when an event is deleted from the event repository during the cleanup process

We also need to take into account the order in which the rules are listed in the pw/server/etc/<cell>\kb\rules\.load file.

Analyzing rule trace

In this scenario, an event is sent, where mc_object_class and msg slot have "Windows" as value. Rule tracing is also turned ON.

msend -n sim -a SAM -b "mc_object_class=Windows" -m Windows

But in the BMC ProactiveNet User Console, the msg slot has changed (it should be "Windows") and the following text is displayed: This is Windows please reassign to Sam TRUONG.

Following is the BAROC export of the event:

SAM;
            event_handle=2907;
            mc_ueid='mc.sim.10570647.0';
            mc_client_address='192.168.0.200';
            adapter_host='';
            mc_location='sam.com';
            mc_service='';
            mc_host_class='';
            mc_host='sim.sam.com';
            mc_host_address='192.168.0.200';
            mc_host_id=0;
            mc_account='';

            mc_object_class='Windows';

            mc_object='';
            mc_object_uri='';
            mc_object_owner='';
            mc_tool_class='';
            mc_tool='';
            mc_tool_id='';
            mc_tool_rule='';
            mc_tool_key='';
            mc_tool_sev='';
            mc_tool_address='192.168.0.200';
            mc_tool_uri='';
            mc_tool_time=1347880519;
            mc_tool_suggestion='';
            mc_origin_class='';
            mc_origin='';
            mc_origin_key='';
            mc_origin_sev='';
            mc_parameter='';
            mc_parameter_value='';
            mc_parameter_unit='';
            mc_parameter_threshold='';
            mc_event_category=;
            mc_event_subcategory=OTHER;
            mc_event_model_version='1.1';
            mc_incident_time=0;
            mc_incident_report_time=1347880519;
            mc_arrival_time=1347880519;
            mc_local_reception_time=1347880519;
            date_reception=1347880519;
            date='20120917131519.000000+060';
            status=OPEN;
            severity=WARNING;
            mc_original_severity=WARNING;
            mc_priority=PRIORITY_5;
            mc_original_priority=PRIORITY_5;
            mc_owner='Sam TRUONG';
            mc_long_msg='';

            msg='This is Windows please reassign to Sam TRUONG';

            duration=0;
            mc_timeout=0;
            repeat_count=0;
            mc_action_count=0;
            administrator='';
            mc_acl=[];
            mc_date_modification=1347880519;
            mc_notes=[];
            mc_operations=[];
            mc_notification_history=[];
            mc_bad_slot_names=[];
            mc_bad_slot_values=[];
            mc_history=[];
            mc_modhist=[];
            mc_propagations=[];
            mc_collectors=['1.1','2.2.1','6.1.1'];
            mc_abstraction=[];
            mc_abstracted=[];
            mc_associations=[];
            mc_cause=0;
            mc_effects=[];
            mc_event_relations=[];
            mc_relation_source='';
            mc_smc_id='';
            mc_smc_alias='';
            mc_smc_impact=NOT_ELECTED;
            mc_smc_type='';
            mc_smc_priority=0.000000e+000;
            mc_smc_causes=[];
            mc_smc_effects=[];
            itsm_category='';
            itsm_type='';
            itsm_item='';
            itsm_product_name='';
            itsm_model_version='';
            itsm_manufacturer='';
            itsm_operational_category1='';
            itsm_operational_category2='';
            itsm_operational_category3='';
            itsm_company='';
            itsm_location='';
            pn_detail_diag=0;
            pn_detail_diag_count=0;
            pn_device_name='';
            IT_person='Sam TRUONG';
END

The .load file is displayed as follows:

kbversem
kbverssim
mc_startup
im_internal
mc_intevt
impact_admin_server
ips
mc_sm_start
mc_sm_associate
mc_ci_policies
mc_sm_maintenance
mc_sm_elect
mc_sm_attach
mc_sm_shadow
mc_sm_slm
nopass
bii4p
admin_propagate
patrol_portal
ibrsd_event_association
ibrsd_event_incident_info
ibrsd_error_event_association
ibrsd_bem_categorization
ibrsd_sim_remote_action
ibrsd_orphaned_root_cause
ibrsd_root_cause
ibrsd_causal_priority
ibrsd_propagate_causal_events
ibrsd_closeoriginal
sam
sam_refine
sam_propagate

The cell trace is displayed as follows when the cell receives the event:

20120917 131519.818000 mcell: EVTPROC: BMC-IMC090101V: Received message #1 EVENT [255] :SAM;

            msg=Windows;

            mc_object_class=Windows;

            mc_host_address=192.168.0.200;
            mc_host=sim.sam.com;
            mc_arrival_time=1347880519;
            mc_incident_report_time=1347880519;
            mc_event_model_version=1.1;
            mc_tool_time=1347880519;
            mc_tool_address=192.168.0.200;
END

Both the msg and mc_object_class slots still have "Windows" as value.

Now, the different refine rules within im_internal.mrl, mc_sm_associate.mrl, ibrsd_bem_categorization.mrl will be processed against this event. This is because the Refine rule is the first event rule in the processing phase. Also, im_internal.mrl in placed above the others, in the pw/server/etc/<cell>/kb/rules\.load file.

You can use the value of the event_handle and mc_ueid slots to locate the event in the trace. In the example below:

Event_handle= 2907

Mc_ueid= mc.sim.10570647.0

20120917 131519.865000 mcell: MESSAGES: BMC-IMC046014V: Send message #31.1 ANSWER [12]
20120917 131519.865000 mcell: SERVICE: BMC-IMC050117V: Sent message #31.1 answering #31.1

20120917 131519.865000 mcell: BAROC: BMC-IMC032262V: Upgrade event #2907 @0x2098ba0 to context Refine

20120917 131519.865000 mcell: EVTPROC: BMC-IMC090102V: Start refine on event #2907 @0x2098ba0
20120917 131519.865000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 15: refine im_internal_lowercase_hostname: SAM #2907: Rule execution starting with

$EV = 0x2098ba0 (class: SAM,   event_handle: 2907, mc_ueid: mc.sim.10570647.0)

20120917 131519.865000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 18: refine im_internal_lowercase_hostname: SAM #2907: $EV.mc_host = 'sim.sam.com'
20120917 131519.865000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 102: refine im_internal_default_location: SAM #2907: Rule execution starting with
$EV = 0x2098ba0 (class: SAM,   event_handle: 2907, mc_ueid: mc.sim.10570647.0)
20120917 131519.865000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 111: refine im_internal_default_location: SAM #2907: $HLEN = 0xb
20120917 131519.865000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 112: refine im_internal_default_location: SAM #2907: $ZPOS = 0x4
20120917 131519.865000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 119: refine im_internal_default_location: SAM #2907: $EV.mc_location = 'sam.com'
20120917 131519.865000 mcell: RULES: BMC-IMC110801I: mc_sm_associate.mrl, 17: refine initialization_steps: SAM #2907: Rule execution starting with
$RAW_EVENT = 0x2098ba0 (class: SAM,   event_handle: 2907, mc_ueid: mc.sim.10570647.0)
20120917 131519.865000 mcell: RULES: BMC-IMC110203I: mc_sm_associate.mrl, 20: refine initialization_steps: SAM #2907: $RAW_EVENT.mc_smc_type = ''
20120917 131519.881000 mcell: RULES: BMC-IMC110801I: ibrsd_bem_categorization.mrl, 6: refine create_index_for_dda_match: SAM #2907: Rule execution starting with
$NEW = 0x2098ba0 (class: SAM,   event_handle: 2907, mc_ueid: mc.sim.10570647.0)
20120917 131519.881000 mcell: RULES: BMC-IMC110203I: ibrsd_bem_categorization.mrl, 8: refine create_index_for_dda_match: SAM #2907: $IBRSD_RecordIndex.Index = ':Windows'
Looking further down in the trace, we can see the new value of the msg slot is set by the IT_Person_Windows rule, located in the sam_refine.mrl file
20120917 131519.881000 mcell: RULES: BMC-IMC110801I: sam_refine.mrl, 1: refine IT_Person_Windows: SAM #2907: Rule execution starting with
$EV = 0x2098ba0 (class: SAM,   event_handle: 2907, mc_ueid: mc.sim.10570647.0)

20120917 131519.881000 mcell: RULES: BMC-IMC110203I: sam_refine.mrl, 3: refine IT_Person_Windows: SAM #2907: $EV.IT_person = 'Sam TRUONG'

20120917 131519.881000 mcell: RULES: BMC-IMC110203I: sam_refine.mrl, 4: refine IT_Person_Windows: SAM #2907: $EV.mc_owner = 'Sam TRUONG'

20120917 131519.881000 mcell: RULES: BMC-IMC110203I: sam_refine.mrl, 5: refine IT_Person_Windows: SAM #2907: $EV.msg = 'This is Windows please reassign to Sam TRUONG'

20120917 131519.896000 mcell: RULES: BMC-IMC110005V: Stop refine event #2907 @0x2098ba0 : CONT

The above rule is responsible for the new value of the msg slot. The sam_refine.mrl file is displayed as follows:

refine IT_Person_Windows: SAM($EV) where [$EV.mc_object_class==Windows]

{
$EV.IT_person = 'Sam TRUONG';
 $EV.mc_owner = $EV.IT_person;
 $EV.msg=concat('This is ',string($EV.mc_object_class),' please reassign to ', string($EV.IT_person));
 }

END

The rule applies to the SAM event class and if mc_object_class = Windows, then the following values will be assigned to the IT_person, mc_owner and msg slots:

$EV.IT_person = 'Sam TRUONG';
$EV.mc_owner = $EV.IT_person;
$EV.msg=concat(['This is ',string($EV.mc_object_class),' please reassign to ', string($EV.IT_person)]);

Which confirms what is previously mentioned in the cell trace.

Events are getting dropped by cell

Here the scenario is that some events are not visible in BMC ProactiveNet User Console. Issuing mquery command does not return the events either. The issue needs to be reproduced with the cell trace enabled. Event is received by cell as follows:

20120917 192737.078000 mcell: EVTPROC: BMC-IMC090101V: Received message #1 EVENT [1229] :PATROL_EV;
            p_agent=hermes;
            p_agent_address='192.168.0.210';
            mc_host_class='NT 5.2 Microsoft Windows Server 2003 Enterprise Edition (Service Pack 2) x86';
            p_agent_version='V3.7';
            p_agent_port=3181;
            mc_origin_class='V3.7';
            adapter_host='sim.sam.com';
            mc_tool_class='BII4PATROL 7.3.60';
            mc_origin_key=1274;
            p_status=0;
            status=OPEN;
            p_type=5;
            severity=CRITICAL;
            mc_origin_sev=4;
            p_args=['Alarm #2',global,CPUprcrProcessorTimePercent,'NT_CPU.CPU_0',95,'100.00',100];
            p_origin='NT_CPU.CPU_0.CPUprcrProcessorTimePercent';
            p_application=NT_CPU;
            p_instance=CPU_0;
            mc_parameter=CPUprcrProcessorTimePercent;
            p_catalog=STD;
            p_class=11;
            mc_parameter_value='100.00';
            p_class_group=param_group;
            msg='Alarm #2 of global parameter ''CPUprcrProcessorTimePercent'' triggered on ''NT_CPU.CPU_0''.  95 <= 100.00 <= 100';
            mc_host=hermes;
            mc_tool_key=1274;
            mc_incident_time=1341937179;
            p_source_id=11;
            mc_object_class=NT_CPU;
            mc_object=CPU_0;
            mc_host_address='192.168.0.210';
            mc_origin='hermes:3181';
            mc_ueid='BII4Patrol/192.168.0.210/3181/1341937179/1274';
            mc_arrival_time=1347902856;
            mc_incident_report_time=1347902856;
            mc_event_model_version='1.1';
            mc_tool_time=1347902856;
            mc_tool_address='192.168.0.200';
END
20120917 192737.109000 mcell: MESSAGES: BMC-IMC046014V: Send message #11.1 ANSWER [12]
20120917 192737.109000 mcell: SERVICE: BMC-IMC050117V: Sent message #11.1 answering #11.1
20120917 192737.125000 mcell: BAROC: BMC-IMC032262V: Upgrade event #3062 @0x21185a8 to context Refine
20120917 192737.125000 mcell: EVTPROC: BMC-IMC090102V: Start refine on event #3062 @0x21185a8
20120917 192737.125000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 15: refine im_internal_lowercase_hostname: PATROL_EV #3062: Rule execution starting with
$EV = 0x21185a8 (class: PATROL_EV,   event_handle: 3062, mc_ueid: BII4Patrol/192.168.0.210/3181/1341937179/1274)
20120917 192737.125000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 18: refine im_internal_lowercase_hostname: PATROL_EV #3062: $EV.mc_host = hermes
20120917 192737.125000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 102: refine im_internal_default_location: PATROL_EV #3062: Rule execution starting with
$EV = 0x21185a8 (class: PATROL_EV,   event_handle: 3062, mc_ueid: BII4Patrol/192.168.0.210/3181/1341937179/1274)
20120917 192737.125000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 111: refine im_internal_default_location: PATROL_EV #3062: $HLEN = 0x6
20120917 192737.125000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 112: refine im_internal_default_location: PATROL_EV #3062: $ZPOS = 0x0
20120917 192737.125000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 115: refine im_internal_default_location: PATROL_EV #3062: $EV.mc_location = Unknown
20120917 192737.125000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 156: refine dden_policy_rule: PATROL_EV #3062: Rule execution starting with
$EV = 0x21185a8 (class: PATROL_EV,   event_handle: 3062, mc_ueid: BII4Patrol/192.168.0.210/3181/1341937179/1274)
$POL = 0x20f7c68 (class: IM_DDEN_POLICY,   data_handle: 297, mc_udid: mc.sim.d3f120e.294)
20120917 192737.125000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 165: refine dden_policy_rule: PATROL_EV #3062: $L = [NT_CPU]
20120917 192737.140000 mcell: EVTPROC: BMC-IMC090116V: Modified slots of event #3062 @0x21185a8 to Rule Engine
20120917 192737.140000 mcell: RULES: BMC-IMC110801I: mc_sm_associate.mrl, 17: refine initialization_steps: PATROL_EV #3062: Rule execution starting with
$RAW_EVENT = 0x21185a8 (class: PATROL_EV,   event_handle: 3062, mc_ueid: BII4Patrol/192.168.0.210/3181/1341937179/1274)
20120917 192737.140000 mcell: RULES: BMC-IMC110203I: mc_sm_associate.mrl, 20: refine initialization_steps: PATROL_EV #3062: $RAW_EVENT.mc_smc_type = ''
20120917 192737.140000 mcell: RULES: BMC-IMC110202I: mc_sm_associate.mrl, 60: refine find_most_specific_formulas_and_build_mc_smc_id1: PATROL_EV #3062: solution 1 to data query:
$ALIAS = 0x2116e78 (class: BMC_SIM_ALIAS,   data_handle: 1172, mc_udid: mc.sim.d877dc7.55)
20120917 192737.140000 mcell: RULES: BMC-IMC110801I: mc_sm_associate.mrl, 54: refine find_most_specific_formulas_and_build_mc_smc_id1: PATROL_EV #3062: Rule execution starting with
$RAW_EVENT = 0x21185a8 (class: PATROL_EV,   event_handle: 3062, mc_ueid: BII4Patrol/192.168.0.210/3181/1341937179/1274)
$ALIAS = 0x2116e78 (class: BMC_SIM_ALIAS,   data_handle: 1172, mc_udid: mc.sim.d877dc7.55)
20120917 192737.140000 mcell: RULES: BMC-IMC110203I: mc_sm_associate.mrl, 74: refine find_most_specific_formulas_and_build_mc_smc_id1: PATROL_EV #3062: $RAW_EVENT.mc_smc_alias = hermes
20120917 192737.140000 mcell: RULES: BMC-IMC110203I: mc_sm_associate.mrl, 75: refine find_most_specific_formulas_and_build_mc_smc_id1: PATROL_EV #3062: $RAW_EVENT.mc_smc_id = 'OI-FD2173E6DF7A4B59B9687CC60D27A596'
20120917 192737.140000 mcell: RULES: BMC-IMC110202I: mc_sm_associate.mrl, 121: refine associate_new_event: PATROL_EV #3062: solution 1 to data query:
$COMP = 0x2117470 (class: BMC_ComputerSystem,   data_handle: 1171, mc_udid: OI-FD2173E6DF7A4B59B9687CC60D27A596)
20120917 192737.140000 mcell: RULES: BMC-IMC110801I: mc_sm_associate.mrl, 113: refine associate_new_event: PATROL_EV #3062: Rule execution starting with
$RAW_EVENT = 0x21185a8 (class: PATROL_EV,   event_handle: 3062, mc_ueid: BII4Patrol/192.168.0.210/3181/1341937179/1274)
$COMP = 0x2117470 (class: BMC_ComputerSystem,   data_handle: 1171, mc_udid: OI-FD2173E6DF7A4B59B9687CC60D27A596)
20120917 192737.140000 mcell: RULES: BMC-IMC110203I: mc_sm_associate.mrl, 127: refine associate_new_event: PATROL_EV #3062: $RAW_EVENT.mc_smc_type = BMC_ComputerSystem
20120917 192737.156000 mcell: RULES: BMC-IMC110203I: mc_sm_associate.mrl, 132: refine associate_new_event: PATROL_EV #3062: $RAW_EVENT.mc_host_id = 0x0
20120917 192737.156000 mcell: RULES: BMC-IMC110801I: bii4p.mrl, 220: refine adapt_mc_tool: PATROL_EV #3062: Rule execution starting with
$EV = 0x21185a8 (class: PATROL_EV,   event_handle: 3062, mc_ueid: BII4Patrol/192.168.0.210/3181/1341937179/1274)
20120917 192737.156000 mcell: RULES: BMC-IMC110203I: bii4p.mrl, 225: refine adapt_mc_tool: PATROL_EV #3062: $EV.mc_tool = 'hermes:3181'
20120917 192737.156000 mcell: RULES: BMC-IMC110203I: bii4p.mrl, 226: refine adapt_mc_tool: PATROL_EV #3062: $EV.mc_tool_key = '1274'
20120917 192737.156000 mcell: RULES: BMC-IMC110203I: bii4p.mrl, 227: refine adapt_mc_tool: PATROL_EV #3062: $EV.mc_tool_sev = '4'
20120917 192737.156000 mcell: RULES: BMC-IMC110801I: ibrsd_bem_categorization.mrl, 6: refine create_index_for_dda_match: PATROL_EV #3062: Rule execution starting with
$NEW = 0x21185a8 (class: PATROL_EV,   event_handle: 3062, mc_ueid: BII4Patrol/192.168.0.210/3181/1341937179/1274)
20120917 192737.156000 mcell: RULES: BMC-IMC110203I: ibrsd_bem_categorization.mrl, 8: refine create_index_for_dda_match: PATROL_EV #3062: $IBRSD_RecordIndex.Index = 'NT 5.2 Microsoft Windows Server 2003 Enterprise Edition (Service Pack 2) x86:NT_CPU'
20120917 192737.156000 mcell: RULES: BMC-IMC110005V: Stop refine event #3062 @0x21185a8 : CONT
The line above shows that the event has gone through the Refine phase just fine. It's now entering the Filter phase.
20120917 192737.171000 mcell: BAROC: BMC-IMC032262V: Upgrade event #3062 @0x21185a8 to context Filter

20120917 192737.171000 mcell: RULES: BMC-IMC110801I: nopass.mrl, 5: filter TEST: PATROL_EV #3062: Rule execution starting with

$THIS = 0x21185a8 (class: PATROL_EV,   event_handle: 3062, mc_ueid: BII4Patrol/192.168.0.210/3181/1341937179/1274)

20120917 192737.171000 mcell: FILTER: BMC-IMC093101V: Applicable filter nopass__TEST:1

20120917 192737.171000 mcell: FILTER: BMC-IMC093104V: Event discarded by NOPASS filter set nopass__TEST: filter 1

20120917 192737.171000 mcell: EVTPROC: BMC-IMC090110V: Removed event #3062 @0x21185a8

20120917 192737.187000 mcell: EVTPROC: BMC-IMC090109V: Event #3062 @0x21185a8 discarded

And here, the filter TEST, which applies to the PATROL_EV event class, in the nopass.mrl file, was responsible for dropping the event. The nopass.mrl file is displayed as follows:

filter TEST : NOPASS

PATROL_EV where [$THIS.mc_object_class==NT_CPU AND $THIS.mc_host == hermes]

END

When enabled, this filter will drop all PATROL_EV events which have mc_object_class slot = 'NT_CPU' and mc_host slot = 'hermes', and the event meets the following criteria:

Checking the original event to confirm:

PATROL_EV;
            p_agent=hermes;
            p_agent_address='192.168.0.210';
            mc_host_class='NT 5.2 Microsoft Windows Server 2003 Enterprise Edition (Service Pack 2) x86';
            p_agent_version='V3.7';
            p_agent_port=3181;
            mc_origin_class='V3.7';
            adapter_host='sim.sam.com';
            mc_tool_class='BII4PATROL 7.3.60';
            mc_origin_key=1274;
            p_status=0;
            status=OPEN;
            p_type=5;
            severity=CRITICAL;
            mc_origin_sev=4;
            p_args=['Alarm #2',global,CPUprcrProcessorTimePercent,'NT_CPU.CPU_0',95,'100.00',100];
            p_origin='NT_CPU.CPU_0.CPUprcrProcessorTimePercent';
            p_application=NT_CPU;
            p_instance=CPU_0;
            mc_parameter=CPUprcrProcessorTimePercent;
            p_catalog=STD;
            p_class=11;
            mc_parameter_value='100.00';
            p_class_group=param_group;
            msg='Alarm #2 of global parameter ''CPUprcrProcessorTimePercent'' triggered on ''NT_CPU.CPU_0''.  95 <= 100.00 <= 100';

            mc_host=hermes;

            mc_tool_key=1274;
            mc_incident_time=1341937179;
            p_source_id=11;

            mc_object_class=NT_CPU;

            mc_object=CPU_0;
            mc_host_address='192.168.0.210';
            mc_origin='hermes:3181';
            mc_ueid='BII4Patrol/192.168.0.210/3181/1341937179/1274';
            mc_arrival_time=1347902856;
            mc_incident_report_time=1347902856;
            mc_event_model_version='1.1';
            mc_tool_time=1347902856;
            mc_tool_address='192.168.0.200';
END

Preceding are some simple examples but when analyzing more complex rule issues, the principle remains the same.

Troubleshooting policy related issues

As custom policies are stored in the mcdb, you need to either open mcdb using a text editor to search for them or use the mquery command to query the cell.

Here is an example of an enrichment policy:

IM_ENRICHMENT_POLICY;
            data_handle=1298;
            mc_udid='mc.sim.10576339.0';
            mc_creation_time=1347904313;
            mc_modification_time=1347904313;
            mc_modification_request_time=1347904313;
            mc_modification_requestor=struong;
            publish_env_id='';
            name=Sam_Windows_Pol;
            description=Sam_Windows_Pol;
            enabled=1;
            active_timeframes=[];
            except_timeframes=[];
            active_global_timeframes=[];
            except_global_timeframes=[];
            selector_name=SAM_Windows;
            selector_class=SAM;
            selector_ecf='*SAM_Windows ($EV)';
            ordinal=0;
            during_schedule=YES;
            override_msg=1;
            override_event_severity=0;
            override_event_priority=0;
            override_event_category=0;
            override_object_type=0;
            override_location=0;
            override_service=0;
            msg_format='This is Windows. Please reassign to Sam TRUONG';
            msg_slots=[];
            event_severity=UNKNOWN;
            event_priority=PRIORITY_5;
            event_category='';
            object_type='';
            location='';
            service='';
END

The following mquery command would allow you to retrieve the same policy:

mquery -n <cell> -a IM_ENRICHMENT_POLICY -d -w "name: contains Sam" -f BAROC
mquery -n <cell> -a IM_ENRICHMENT_POLICY -d -w "selector_class: == SAM"-f BAROC
mquery -n <cell> -a IM_ENRICHMENT_POLICY -d -w "selector_name: == SAM_Windows"-f BAROC

Here is what cell trace shows when an event is enriched by a policy:

20120917 195339.781000 mcell: EVTPROC: BMC-IMC090101V: Received message #1 EVENT [255] :SAM;
            msg=message;
            mc_object_class=Windows;
            mc_host_address=192.168.0.200;
            mc_host=sim.sam.com;
            mc_arrival_time=1347904419;
            mc_incident_report_time=1347904419;
            mc_event_model_version=1.1;
            mc_tool_time=1347904419;
            mc_tool_address=192.168.0.200;
END
20120917 195339.828000 mcell: MESSAGES: BMC-IMC046014V: Send message #7.1 ANSWER [12]
20120917 195339.828000 mcell: SERVICE: BMC-IMC050117V: Sent message #7.1 answering #7.1
20120917 195339.828000 mcell: BAROC: BMC-IMC032262V: Upgrade event #3075 @0x208c8c0 to context Refine
20120917 195339.828000 mcell: EVTPROC: BMC-IMC090102V: Start refine on event #3075 @0x208c8c0
20120917 195339.828000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 15: refine im_internal_lowercase_hostname: SAM #3075: Rule execution starting with
$EV = 0x208c8c0 (class: SAM,   event_handle: 3075, mc_ueid: mc.sim.105763a3.0)
20120917 195339.828000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 18: refine im_internal_lowercase_hostname: SAM #3075: $EV.mc_host = 'sim.sam.com'
20120917 195339.828000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 32: refine im_internal_enrichment: SAM #3075: Rule execution starting with
$EV = 0x208c8c0 (class: SAM,   event_handle: 3075, mc_ueid: mc.sim.105763a3.0)
$POL = 0x20c3568 (class: IM_ENRICHMENT_POLICY,   data_handle: 1298, mc_udid: mc.sim.10576339.0)

20120917 195339.828000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 72: refine im_internal_enrichment: SAM #3075: $HLEN = 0xb

20120917 195339.828000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 73: refine im_internal_enrichment: SAM #3075: $ZPOS = 0x4

20120917 195339.828000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 80: refine im_internal_enrichment: SAM #3075: $EV.mc_location = 'sam.com'

20120917 195339.828000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 92: refine im_internal_enrichment: SAM #3075: $MSG = 'This is Windows. Please reassign to Sam TRUONG'

20120917 195339.828000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 93: refine im_internal_enrichment: SAM #3075: $EV.msg = 'This is Windows. Please reassign to Sam TRUONG'

Warning

Note

This policy is actually a refine rule.

In the following example, events are getting deduplicated. By checking the cell trace, the first event is displayed as follows:

20120917 202046.156000 mcell: EVTPROC: BMC-IMC090101V: Received message #1 EVENT [247] :EVENT;
            msg=EVENT1;
            severity=MAJOR;
            mc_host_address=192.168.0.200;
            mc_host=sim.sam.com;
            mc_arrival_time=1347906046;
            mc_incident_report_time=1347906046;
            mc_event_model_version=1.1;
            mc_tool_time=1347906046;
            mc_tool_address=192.168.0.200;
END
20120917 202046.203000 mcell: MESSAGES: BMC-IMC046014V: Send message #16.1 ANSWER [12]
20120917 202046.203000 mcell: SERVICE: BMC-IMC050117V: Sent message #16.1 answering #16.1
20120917 202046.218000 mcell: BAROC: BMC-IMC032262V: Upgrade event #3078 @0xa6c0b8 to context Refine
20120917 202046.218000 mcell: EVTPROC: BMC-IMC090102V: Start refine on event #3078 @0xa6c0b8
20120917 202046.218000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 15: refine im_internal_lowercase_hostname: EVENT #3078: Rule execution starting with
$EV = 0xa6c0b8 (class: EVENT,   event_handle: 3078, mc_ueid: mc.sim.105769fe.0)
20120917 202046.218000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 18: refine im_internal_lowercase_hostname: EVENT #3078: $EV.mc_host = 'sim.sam.com'
20120917 202046.218000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 102: refine im_internal_default_location: EVENT #3078: Rule execution starting with
$EV = 0xa6c0b8 (class: EVENT,   event_handle: 3078, mc_ueid: mc.sim.105769fe.0)
20120917 202046.218000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 111: refine im_internal_default_location: EVENT #3078: $HLEN = 0xb
20120917 202046.218000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 112: refine im_internal_default_location: EVENT #3078: $ZPOS = 0x4
20120917 202046.218000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 119: refine im_internal_default_location: EVENT #3078: $EV.mc_location = 'sam.com'
20120917 202046.218000 mcell: RULES: BMC-IMC110801I: mc_sm_associate.mrl, 17: refine initialization_steps: EVENT #3078: Rule execution starting with
$RAW_EVENT = 0xa6c0b8 (class: EVENT,   event_handle: 3078, mc_ueid: mc.sim.105769fe.0)
20120917 202046.218000 mcell: RULES: BMC-IMC110203I: mc_sm_associate.mrl, 20: refine initialization_steps: EVENT #3078: $RAW_EVENT.mc_smc_type = ''
20120917 202046.218000 mcell: RULES: BMC-IMC110801I: ibrsd_bem_categorization.mrl, 6: refine create_index_for_dda_match: EVENT #3078: Rule execution starting with
$NEW = 0xa6c0b8 (class: EVENT,   event_handle: 3078, mc_ueid: mc.sim.105769fe.0)
20120917 202046.218000 mcell: RULES: BMC-IMC110203I: ibrsd_bem_categorization.mrl, 8: refine create_index_for_dda_match: EVENT #3078: $IBRSD_RecordIndex.Index = ':'
20120917 202046.218000 mcell: RULES: BMC-IMC110005V: Stop refine event #3078 @0xa6c0b8 : CONT
20120917 202046.234000 mcell: BAROC: BMC-IMC032262V: Upgrade event #3078 @0xa6c0b8 to context Filter
20120917 202046.234000 mcell: FILTER: BMC-IMC093102V: Event passed filters
20120917 202046.234000 mcell: BAROC: BMC-IMC032262V: Upgrade event #3078 @0xa6c0b8 to context Engine
20120917 202046.234000 mcell: EVTPROC: BMC-IMC090106V: Transfer event #3078 @0xa6c0b8 to Rule Engine
20120917 202046.234000 mcell: RULES: BMC-IMC110007V: Starting transaction on event #3078 @0xa6c0b8
20120917 202046.234000 mcell: RULES: BMC-IMC110008V: Analysing event #3078 @0xa6c0b8
20120917 202046.234000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 314: new im_internal_recurrence: EVENT #3078: Rule execution starting with
$EV = 0xa6c0b8 (class: EVENT,   event_handle: 3078, mc_ueid: mc.sim.105769fe.0)
$POL = 0x9e0588 (class: IM_RECURRENCE_POLICY,   data_handle: 1299, mc_udid: mc.sim.105769a9.0)
20120917 202046.234000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 753: correlate im_internal_correlation: EVENT #3078: Rule execution starting with
$EV = 0xa6c0b8 (class: EVENT,   event_handle: 3078, mc_ueid: mc.sim.105769fe.0)
20120917 202046.234000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 1033: execute im_internal_timeout: EVENT #3078: Rule execution starting with
$EV = 0xa6c0b8 (class: EVENT,   event_handle: 3078, mc_ueid: mc.sim.105769fe.0)
20120917 202046.234000 mcell: RULES: BMC-IMC110806I: im_internal.mrl, 1035: execute im_internal_timeout: EVENT #3078: when block execution starting
20120917 202046.250000 mcell: BAROC: BMC-IMC032262V: Upgrade event #3078 @0xa6c0b8 to context Permanent
20120917 202046.250000 mcell: RULES: BMC-IMC110009V: Terminating transaction on event #3078 @0xa6c0b8
20120917 202046.250000 mcell: COLLECT: BMC-IMC103102V: Event #3078 added to collector Open of set By Status
20120917 202046.250000 mcell: COLLECT: BMC-IMC103102V: Event #3078 added to collector sim of set By Location
20120917 202046.250000 mcell: COLLECT: BMC-IMC103103V: Event #3078 not added to collector of set MC_Related_Events
20120917 202046.250000 mcell: COLLECT: BMC-IMC103103V: Event #3078 not added to collector of set PATROL
20120917 202046.250000 mcell: COLLECT: BMC-IMC103103V: Event #3078 not added to collector of set MC_SMC_Events
20120917 202046.250000 mcell: COLLECT: BMC-IMC103102V: Event #3078 added to collector Open of set All Events
20120917 202046.250000 mcell: COLLECT: BMC-IMC103103V: Event #3078 not added to collector of set Event Incidents By Status
20120917 202046.250000 mcell: COLLECT: BMC-IMC103103V: Event #3078 not added to collector of set Event Incidents By Priority
20120917 202046.250000 mcell: COLLECT: BMC-IMC103103V: Event #3078 not added to collector of set Incident Errors
20120917 202046.250000 mcell: COLLECT: BMC-IMC103103V: Event #3078 not added to collector of set Component Incidents By Status
20120917 202046.250000 mcell: COLLECT: BMC-IMC103103V: Event #3078 not added to collector of set Component Incidents By Priority
20120917 202046.250000 mcell: RULES: BMC-IMC110801I: sam_propagate.mrl, 1: propagate Critical_to_ibrsd: EVENT #3078: Rule execution starting with
$EV = 0xa6c0b8 (class: EVENT,   event_handle: 3078, mc_ueid: mc.sim.105769fe.0)

20120917 202046.250000 mcell: EVTPROC: BMC-IMC090108V: Event #3078 @0xa6c0b8 handled

The above message indicates that the first event has gone through all the rule phases and was not removed or dropped. Now, in the cell trace, we see the second event:

20120917 202059.093000 mcell: EVTPROC: BMC-IMC090101V: Received message #1 EVENT [250] :EVENT;
            msg=EVENT2;
            severity=CRITICAL;
            mc_host_address=192.168.0.200;
            mc_host=sim.sam.com;
            mc_arrival_time=1347906059;
            mc_incident_report_time=1347906059;
            mc_event_model_version=1.1;
            mc_tool_time=1347906059;
            mc_tool_address=192.168.0.200;
END
20120917 202059.140000 mcell: MESSAGES: BMC-IMC046014V: Send message #18.1 ANSWER [12]
20120917 202059.140000 mcell: SERVICE: BMC-IMC050117V: Sent message #18.1 answering #18.1
20120917 202059.140000 mcell: BAROC: BMC-IMC032262V: Upgrade event #3079 @0xa487f8 to context Refine
20120917 202059.140000 mcell: EVTPROC: BMC-IMC090102V: Start refine on event #3079 @0xa487f8
20120917 202059.140000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 15: refine im_internal_lowercase_hostname: EVENT #3079: Rule execution starting with

$EV = 0xa487f8 (class: EVENT,   event_handle: 3079, mc_ueid: mc.sim.10576a0b.0)

20120917 202059.140000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 18: refine im_internal_lowercase_hostname: EVENT #3079: $EV.mc_host = 'sim.sam.com'
20120917 202059.140000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 102: refine im_internal_default_location: EVENT #3079: Rule execution starting with
$EV = 0xa487f8 (class: EVENT,   event_handle: 3079, mc_ueid: mc.sim.10576a0b.0)
20120917 202059.140000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 111: refine im_internal_default_location: EVENT #3079: $HLEN = 0xb
20120917 202059.140000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 112: refine im_internal_default_location: EVENT #3079: $ZPOS = 0x4
20120917 202059.140000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 119: refine im_internal_default_location: EVENT #3079: $EV.mc_location = 'sam.com'
20120917 202059.156000 mcell: RULES: BMC-IMC110801I: mc_sm_associate.mrl, 17: refine initialization_steps: EVENT #3079: Rule execution starting with
$RAW_EVENT = 0xa487f8 (class: EVENT,   event_handle: 3079, mc_ueid: mc.sim.10576a0b.0)
20120917 202059.156000 mcell: RULES: BMC-IMC110203I: mc_sm_associate.mrl, 20: refine initialization_steps: EVENT #3079: $RAW_EVENT.mc_smc_type = ''
20120917 202059.156000 mcell: RULES: BMC-IMC110801I: ibrsd_bem_categorization.mrl, 6: refine create_index_for_dda_match: EVENT #3079: Rule execution starting with
$NEW = 0xa487f8 (class: EVENT,   event_handle: 3079, mc_ueid: mc.sim.10576a0b.0)
20120917 202059.156000 mcell: RULES: BMC-IMC110203I: ibrsd_bem_categorization.mrl, 8: refine create_index_for_dda_match: EVENT #3079: $IBRSD_RecordIndex.Index = ':'
20120917 202059.156000 mcell: RULES: BMC-IMC110005V: Stop refine event #3079 @0xa487f8 : CONT
20120917 202059.156000 mcell: BAROC: BMC-IMC032262V: Upgrade event #3079 @0xa487f8 to context Filter
20120917 202059.156000 mcell: FILTER: BMC-IMC093102V: Event passed filters
20120917 202059.171000 mcell: BAROC: BMC-IMC032262V: Upgrade event #3079 @0xa487f8 to context Engine
20120917 202059.171000 mcell: EVTPROC: BMC-IMC090106V: Transfer event #3079 @0xa487f8 to Rule Engine
20120917 202059.171000 mcell: RULES: BMC-IMC110007V: Starting transaction on event #3079 @0xa487f8
20120917 202059.171000 mcell: RULES: BMC-IMC110008V: Analysing event #3079 @0xa487f8

20120917 202059.171000 mcell: RULES: BMC-IMC110801I: im_internal.mrl, 314: new im_internal_recurrence: EVENT #3079: Rule execution starting with

$EV = 0xa487f8 (class: EVENT,   event_handle: 3079, mc_ueid: mc.sim.10576a0b.0)

$POL = 0x9e0588 (class: IM_RECURRENCE_POLICY,   data_handle: 1299, mc_udid: mc.sim.105769a9.0)

20120917 202059.171000 mcell: RULES: BMC-IMC110202I: im_internal.mrl, 321: new im_internal_recurrence: EVENT #3079: solution 1 to event query:

$OLD_EV = 0xa6c0b8 (class: EVENT,   event_handle: 3078, mc_ueid: mc.sim.105769fe.0

Here, the IM_RECURRENCE_POLICY has been ran against the second event (event_handle: 3079) and it actually updated the slots of the first event (event_handle: 3078), referred to as $OLD_EV.<slot>

20120917 202059.171000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 324: new im_internal_recurrence: EVENT #3079: $OLD_EV.repeat_count = 0x1

20120917 202059.171000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 325: new im_internal_recurrence: EVENT #3079: $OLD_EV.mc_timeout = 0x0

20120917 202059.171000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 332: new im_internal_recurrence: EVENT #3079: $OLD_EV.severity = CRITICAL

20120917 202059.171000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 356: new im_internal_recurrence: EVENT #3079: $OLD_EV.msg = EVENT2

20120917 202059.187000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 360: new im_internal_recurrence: EVENT #3079: $OLD_EV.mc_parameter = ''

20120917 202059.187000 mcell: RULES: BMC-IMC110203I: im_internal.mrl, 361: new im_internal_recurrence: EVENT #3079: $OLD_EV.mc_parameter_value = ''

20120917 202059.187000 mcell: EVTPROC: BMC-IMC090116V: Modified slots of event #3078 @0xa6c0b8 to Rule Engine

And then, the last event (#3079) was removed:

20120917 202059.187000 mcell: EVTPROC: BMC-IMC090110V: Removed event #3079 @0xa487f8

20120917 202059.187000 mcell: RULES: BMC-IMC110009V: Terminating transaction on event #3079 @0xa487f8

Following is the recurrence policy:

IM_RECURRENCE_POLICY;
        data_handle=1299;
        mc_udid='mc.sim.105769a9.0';
        mc_creation_time=1347905961;
        mc_modification_time=1347905961;
        mc_modification_request_time=1347905961;
        mc_modification_requestor=struong;
        publish_env_id='';
        name=Recurrence_AllEvents;
        description=Recurrence_AllEvents;
        enabled=1;
        active_timeframes=[];
        except_timeframes=[];
        active_global_timeframes=[];
        except_global_timeframes=[];
        selector_name=All_Events;
        selector_class=EVENT;
        selector_ecf='*All_Events ($EV)';
        ordinal=0;
        during_schedule=YES;
        time_window=0;
        time_scale=SECONDS;
        update_severity=1;
        keep_highest_sev=1;
        update_priority=1;
        keep_highest_pri=1;
        update_msg=1;
        update_parameter=1;
END

 

Tip: For faster searching, add an asterisk to the end of your partial query. Example: cert*

BMC ProactiveNet 9.6