IBM OMEGAMON for z/OS
5.5
Monitoring z/OS
IBM
SC27-4028-02
Note
Before using this information and the product it supports, read the information in “Notices” on page
83.
Edition notice
This edition applies to version 5, release 5, modification 0 of IBM OMEGAMON for z/OS (product number 5698-T01) and
to all subsequent releases and modifications until otherwise indicated in new editions.
© Copyright International Business Machines Corporation 2004, 2022.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with
IBM Corp.
Contents
Figures
Tables
Chapter 1. Using and customizing OMEGAMON for z/OS
Using the predefined workspaces
Sysplex and system managed system names
Organization of the workspaces
Organization of the Sysplex workspaces
Organization of system-level predefined workspaces
Prerequisites for data reporting
Customizing workspaces
Using cross-product workspace links
Finding more information about workspaces
Using the predefined situations
Activating situations
Modifying situations
Migrating situations
Recreating or replacing z/OS Management Console situations
Finding more information about situations
Using historical data collection and reporting
RMF near-term history data
Historical data stored by IBM Tivoli Monitoring
Using Take Action commands
Issuing OMEGAMON for z/OS commands
Issuing UNIX commands
Memory list and memory zap
Chapter 2. Using the Inspect function
Invoking the Inspect function
Modifying the template Inspect link
The Inspect Address Space CPU Use workspace
Understanding the Inspect data–an example
Chapter 3. Using dynamic links to OMEGAMON for MVS
How dynamic linking works
Product-provided links
Creating new dynamic links
Create the target workspace
Modify the associated script
Define the dynamic link
Assigning the Dynamic Link to 3270 query
Chapter 4. Monitoring scenarios
Monitoring shared DASD
Filtering collection of DASD device data
Using shared DASD data collection to identify the cause of I/O delays
Monitoring virtual storage and missing jobs
Monitoring paging and virtual storage
Monitoring critical started tasks
Monitoring service class goals
The scenario
Creating the zOS_Critical_SvcClass_Missed_Goal situation
Defining the Take Action command
Setting thresholds in the WLM Service Class Resources workspace
Analyzing a problem
Monitoring cryptographic coprocessors
Validating your cryptography configuration
Monitoring and improving cryptography performance
Monitoring and improving cross-system ICSF performance
Detecting CPU looping address spaces
Determining the intent of a job
The KM5_CPU_Loop_Warn situation
Investigating and identifying looping jobs
Support information
Figures
1. History Collection Configuration dialog box
2. Create Situations dialog
3. The Formula tab for a filter situation
4. Sample jobs
5. Event flyover for KM5_CPU_Loop_Warn
6. Situation event workspace for KM5_CPU_Loop_Warn
7. Address Space Overview workspace
8. Address Space Bottleneck Summary workspace
9. Examine Details for Job BKEALCP3 screen space
10. Inspect results for BKEALCP3
11. Drill down of INSPECT data
12. Further drill down on INSPECT data
13. OMEGAMON main menu
14. OMEGAMON Action Commands screen
15. OPS CMDS screen space
16. Address Space Bottlenecks Summary workspace without BKEALCP3
Tables
1. Prerequisites for data display
2. Creating OMEGAMON for z/OS situations
3. OMEGAMON z/OS Management Console and OMEGAMON for z/OS equivalent situations
4. Product-provided dynamic links
Chapter 1. Using and customizing OMEGAMON for z/OS
Learn about the monitoring resources provided by OMEGAMON for z/OS and how to use them to meet
your specific requirements.
The following topics are covered:
• Using and customizing predefined workspaces that report Sysplex- and system-level data.
• Activating and customizing predefined situations to enable alerts and reflex actions.
• Configuring historical data collection and reporting by using the Tivoli Enterprise Portal or the
OMEGAMON Enhanced 3270 user interface.
• Issuing UNIX commands using the Take Action feature.
• Finding more detailed information on workspaces, attributes, and situations.
The information on the organization and use of workspaces is useful for all OMEGAMON for z/OS users.
The information on activating and modifying situations and configuring historical data collection is more
useful for users with administrative authority who are responsible for setting up and customizing
monitoring and alerts.
Using the predefined workspaces
OMEGAMON for z/OS includes two sets of predefined workspaces for the Tivoli Enterprise Portal:
Sysplex-level workspaces and system- (or LPAR-) level workspaces. In the Tivoli Enterprise Portal, these
workspaces are accessed either directly from the physical Navigator view or through links from other
workspaces.
In the Navigator, the Sysplex workspaces are listed under each managed Sysplex name, and the system
workspaces are listed under each z/OS® managed system name. In an enterprise with more than one
Sysplex, the Navigator looks something like the following:
Enterprise
  Windows Systems
  z/OS Systems
    SYSPLEX1:MVS:SYSPLEX
      Coupling Facility Policy Data for Sysplex
      Coupling Facility Structures Data for Sysplex
      ...
      Sys1
        MVS Operating System
          DEMOPLX:SYS1:MVSSYS
            Address Space Overview
            Channel Path Activity
      Sys2
    SYSPLEX2:MVS:SYSPLEX
Most of the predefined workspaces are capable of reporting historical data. However, you must configure
and start historical data collection in order for historical data to be available for reporting (see “Using
historical data collection and reporting” on page 16).
Sysplex and system managed system names
From the standpoint of OMEGAMON for z/OS, Sysplexes and systems are managed systems. In the Tivoli
Enterprise Portal Navigator, managed systems are identified by managed system names.
Sysplex managed system names take the form:
plexname:MVS:SYSPLEX
where plexname is typically the true name of the Sysplex, but might be configured to be an alias for the
Sysplex.
System managed system names take the form:
plexname:smfid:MVSSYS
where plexname is typically the true name of the Sysplex, but could be configured to be an alias for the
Sysplex. (This part of the system managed system name typically matches the plexname component of
its parent Sysplex in the navigation tree.) The smfid component is the true System Management Facility
(SMF) ID for the system or LPAR being monitored.
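The two name forms above can be told apart mechanically. The following Python sketch is illustrative only and is not part of the product: the `ManagedSystemName` type and `parse_managed_system_name` function are hypothetical names, and the sketch assumes that no component of a name contains a colon.

```python
from typing import NamedTuple, Optional


class ManagedSystemName(NamedTuple):
    plexname: str          # Sysplex name, or its configured alias
    smfid: Optional[str]   # SMF ID; None for a Sysplex managed system name
    kind: str              # "SYSPLEX" or "MVSSYS"


def parse_managed_system_name(name: str) -> ManagedSystemName:
    """Split a managed system name into its three colon-separated parts."""
    parts = name.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected three colon-separated parts: {name!r}")
    plexname, middle, suffix = parts
    if suffix == "SYSPLEX" and middle == "MVS":
        # Form: plexname:MVS:SYSPLEX
        return ManagedSystemName(plexname, None, "SYSPLEX")
    if suffix == "MVSSYS":
        # Form: plexname:smfid:MVSSYS
        return ManagedSystemName(plexname, middle, "MVSSYS")
    raise ValueError(f"not a Sysplex or system managed system name: {name!r}")


print(parse_managed_system_name("SYSPLEX1:MVS:SYSPLEX"))
print(parse_managed_system_name("DEMOPLX:SYS1:MVSSYS"))
```

Because plexname can be an alias, a script of this kind should not assume that the first component is the true Sysplex name.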
Organization of the workspaces
The physical Navigator view of the Tivoli Enterprise Portal shows an enterprise as a mapping of platforms,
computers, agents, and monitored resources. Each Navigator item can be associated with one or more
workspaces that provide information relevant to that level of the Navigator.
In a Sysplex environment, monitored Sysplexes appear between the platform and system levels of the
Navigator tree, listed by their managed system names:
Enterprise
  Windows Systems
  z/OS Systems
    SYSPLEX1:MVS:SYSPLEX
      Coupling Facility Policy Data for Sysplex
      Coupling Facility Structures Data for Sysplex
      ...
      Sys1
      Sys2
    SYSPLEX2:MVS:SYSPLEX
If OMEGAMON for z/OS is installed, the Sysplex Enterprise Overview is the default workspace for the
z/OS Systems item in the Tivoli Enterprise Portal Navigator. As its name indicates, this workspace
provides an overview of all the Sysplexes in your enterprise. From the table views in this workspace you
can link to Sysplex-level workspaces for a selected Sysplex. From the workspace or the z/OS Systems
Navigator item, you can also access the Cross-System Cryptographic Coprocessor Overview workspace.
To access the workspace from the Navigator, select the z/OS Systems item, then right-click it and select
Workspace from the popup menu.
Each Sysplex item in the Navigator is associated with a Sysplex Level Overview workspace, which
provides summary data for the selected Sysplex and links to more detailed workspaces. Below each
Sysplex item are items for the Sysplex-level resources and components:
SYSPLEX1:MVS:SYSPLEX
  Coupling Facility Policy Data for Sysplex
  Coupling Facility Structures Data for Sysplex
  ...
  Sys1
  Sys2
SYSPLEX2:MVS:SYSPLEX
Each Sysplex item is associated with one or more workspaces that report information on resources
shared by the Sysplex or on Sysplex workloads. Each of these items has a default workspace, which
opens when you select the item. This workspace may have links to other related workspaces. For
example, the default workspace for the Shared DASD Groups Data for Sysplex entry is the Shared DASD
for Groups workspace, which displays information for all the groups in the Sysplex. From this workspace
you can link to a workspace that displays details for a selected group of DASD devices.
Under the Sysplex items in the Navigator are items for every system (or LPAR) in the Sysplex that is
being monitored by an OMEGAMON monitoring agent. Each system has a default System Level Overview
workspace. Under each system item in the Navigator tree is an item for each type of resource
being monitored by an OMEGAMON agent. For example, if you have installed both OMEGAMON for z/OS
and IBM OMEGAMON® for Storage on z/OS, you will see entries for MVS Operating System and Storage
Subsystem.
Sys1
  MVS Operating System
  Storage Subsystem
If you expand the MVS Operating System item, you see the managed system name of the system or LPAR
monitored by the OMEGAMON for z/OS monitoring agent:
Sys1
  MVS Operating System
    DEMOPLX:SYS1:MVSSYS
If you expand the managed system entry, the workspaces that provide information about that system are
listed:
DEMOPLX:SYS1:MVSSYS
  Address Space Overview
  Channel Path Activity
  Common Storage
As with the Sysplex-level entries, every entry under the system name is associated with one or more
workspaces. Each entry has a default workspace, which opens when you select the entry, and which may
have other related workspaces you can access through links in table views in the workspace.
For more information about the organization of the product-provided workspaces and the attribute groups
associated with them, see “Organization of the Sysplex workspaces” on page 4 and “Organization of
system-level predefined workspaces” on page 7.
Organization of the Sysplex workspaces
This topic lists the Sysplex-level workspaces available from the Navigator and the workspaces linked
from the primary and secondary workspaces.
Primary workspaces can be accessed directly from the Navigator. Other, secondary, workspaces can be
accessed only from another workspace.
The following list shows the organization of the Sysplex-level predefined workspaces, beginning with
the Sysplex Enterprise Overview workspace. From the Sysplex Enterprise Overview workspace you can
link to all of the Sysplex-level workspaces listed in the Navigator. The other primary workspaces are
listed in alphabetical order. All other workspaces are shown nested beneath the primary or secondary
workspaces from which they can be linked.
• Sysplex Level Overview
• Coupling Facility Policy Data for Sysplex workspace
• Coupling Facility Structures Data for Sysplex workspace
– MVS Systems Workspace for CF Structure
– Statistics for CF List or Lock Structures workspace
– Statistics for CF Cache Structure workspace
– Users Workspace for CF Structure
• Coupling Facility Systems Data for Sysplex workspace
– Paths Workspace for a CF System
- Path Details Workspace for a CF System
– Statistics Workspace for CF System
• Enterprise Central Processing Complex Overview workspace
– Enterprise CPC Details workspace
- Address Spaces CPU in an LPAR workspace
• Historical Common Storage Utilization workspace
• Historical CPU Utilization for an Address Space workspace
• Historical Delays for Address Space workspace
• Historical Memory Object Utilization workspace
• Historical Real Storage Utilization workspace
- Address Spaces Common Storage in an LPAR workspace
• Historical Common Storage Utilization workspace
• Historical CPU Utilization for an Address Space workspace
• Historical Delays for Address Space workspace
• Historical Memory Object Utilization workspace
• Historical Real Storage Utilization workspace
- Address Spaces Memory Objects in an LPAR workspace
• Historical Common Storage Utilization workspace
• Historical CPU Utilization for an Address Space workspace
• Historical Delays for Address Space workspace
• Historical Memory Object Utilization workspace
• Historical Real Storage Utilization workspace
- Address Spaces Real Storage in an LPAR workspace
• Historical Common Storage Utilization workspace
• Historical CPU Utilization for an Address Space workspace
• Historical Delays for Address Space workspace
• Historical Memory Object Utilization workspace
• Historical Real Storage Utilization workspace
- Historical Summary for a CPC workspace
- Historical Summary for an LPAR workspace
- Historical EADM Device Activity workspace
• EADM Device/Subchannel Summary workspace
• On-Chip Compression Summary workspace
• SCM Card Summary for VFM workspace
- Historical System Memory Objects Summary workspace
- Historical System Real Storage Summary workspace
• Enterprise CPC Details workspace
– Address Spaces CPU in an LPAR workspace
- Historical Common Storage Utilization workspace
- Historical CPU Utilization for an Address Space workspace
- Historical Delays for Address Space workspace
- Historical Memory Object Utilization workspace
- Historical Real Storage Utilization workspace
– Address Spaces Common Storage in an LPAR workspace
- Historical Common Storage Utilization workspace
- Historical CPU Utilization for an Address Space workspace
- Historical Delays for Address Space workspace
- Historical Memory Object Utilization workspace
- Historical Real Storage Utilization workspace
– Address Spaces Memory Objects in an LPAR workspace
- Historical Common Storage Utilization workspace
- Historical CPU Utilization for an Address Space workspace
- Historical Delays for Address Space workspace
- Historical Memory Object Utilization workspace
- Historical Real Storage Utilization workspace
– Address Spaces Real Storage in an LPAR workspace
- Historical Common Storage Utilization workspace
- Historical CPU Utilization for an Address Space workspace
- Historical Delays for Address Space workspace
- Historical Memory Object Utilization workspace
- Historical Real Storage Utilization workspace
– Historical Summary for a CPC workspace
– Historical Summary for an LPAR workspace
– Historical EADM Device Activity workspace
- EADM Device/Subchannel Summary workspace
- On-Chip Compression Summary workspace
- SCM Card Summary for VFM workspace
– Historical System Memory Objects Summary workspace
– Historical System Real Storage Summary workspace
• Global Enqueue Data for Sysplex workspace
– Global Enqueue and Reserve workspace
• GRS Ring Systems Data for Sysplex workspace
• Report Classes Data for Sysplex workspace
– Address Spaces Workspace for Report Classes
• Report Classes Enterprise Extended Metrics workspace
– Report Classes Sysplex Extended Metrics workspace
- Report Class Extended Metrics workspace
• Report Class on System Extended Metrics workspace
- Report Class on System Extended Metrics workspace
– Report Class Extended Metrics workspace
- Report Class on System Extended Metrics workspace
– Report Class on System Extended Metrics workspace
• Resource Groups Data for Sysplex workspace
– Service Classes Workspace for Resource Group
– Resource Groups Extended Metrics workspace
- WLM Resource Groups workspace
– Tenant Resource Groups for a Sysplex workspace
- Tenant Resource Group Systems workspace
• Solution Billing for Tenant on System workspace
– Report Class Periods for a Tenant Resource Group workspace
– Tenant Resource Groups in a Solution ID workspace
- Report Class on System Extended Metrics workspace
• Service Classes Data for Sysplex workspace
– Workflow Analysis Workspace for Service Class
- Workflow Analysis Enqueue Workspace for Service Class Sysplex
- Workflow Analysis I/O Workspace for Service Class
– Periods Workspace for Service Class
- Workflow Analysis Workspace for Service Class Period
• Workflow Analysis Enqueue Workspace for Service Class Period
• Workflow Analysis I/O Workspace for Service Class Period
- Address Spaces Workspace for Service Class Period
– Address Spaces Workspace for Service Class
– Systems Workspace for Service Class
- Workflow Analysis Workspace for Service Class System
• Workflow Analysis Enqueue Workspace for Service Class System
• Workflow Analysis I/O Workspace for Service Class System
- Periods Workspace for Service Class System
• Workflow Analysis Workspace for Service Class Period System
– Workflow Analysis Enqueue Workspace for Service Class Period System
– Workflow Analysis I/O Workspace for Service Class Period System
– Subsystem Workflow Analysis for Service Class workspace
• Service Classes Enterprise Extended Metrics workspace
– Service Classes Sysplex Extended Metrics workspace
- Service Class Extended Metrics workspace
• Service Class on System Extended Metrics workspace
- Service Class on System Extended Metrics workspace
– Service Class Extended Metrics workspace
- Service Class on System Extended Metrics workspace
– Service Class on System Extended Metrics workspace
• Service Definition Data for Sysplex workspace
• Shared DASD Groups Data for Sysplex workspace
– Shared DASD Devices workspace
- Shared DASD Systems workspace
• Sysplex Users of a Device workspace
• XCF Systems Data for Sysplex workspace
– XCF System Statistics workspace
– XCF Paths Workspace from System Device To
• XCF Groups Data for Sysplex workspace
– Members Workspace for XCF Group
• XCF Paths Data for Sysplex workspace
– XCF Paths Workspace from System Device To
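The nesting in the list above can be read as a directed graph in which each workspace points to the workspaces it links to. The following Python sketch is illustrative only: it copies a small excerpt of the list into a hypothetical `LINKS` dictionary (with the trailing word "workspace" dropped from names for brevity) and uses a breadth-first search to find the chain of links from a primary workspace to a secondary one. The `link_path` function is not a product interface.

```python
from collections import deque
from typing import Dict, List, Optional

# Excerpt of the hierarchy listed above: workspace -> workspaces it links to.
LINKS: Dict[str, List[str]] = {
    "Coupling Facility Systems Data for Sysplex": [
        "Paths Workspace for a CF System",
        "Statistics Workspace for CF System",
    ],
    "Paths Workspace for a CF System": [
        "Path Details Workspace for a CF System",
    ],
    "Shared DASD Groups Data for Sysplex": ["Shared DASD Devices"],
    "Shared DASD Devices": ["Shared DASD Systems"],
}


def link_path(start: str, target: str) -> Optional[List[str]]:
    """Return the shortest chain of links from start to target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in LINKS.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = link_path(
    "Coupling Facility Systems Data for Sysplex",
    "Path Details Workspace for a CF System",
)
print(" -> ".join(path) if path else "no link path")
```

A result of None indicates that the target cannot be reached from the chosen starting workspace within the excerpt, which is the situation for any secondary workspace whose parent is not included in `LINKS`.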
Organization of system-level predefined workspaces
OMEGAMON for z/OS provides a set of predefined workspaces that appear in the Tivoli Enterprise
Portal Navigator in the Physical view.
Primary workspaces are accessible directly from the Navigator. The hierarchical tree that follows this
paragraph lists these workspaces in alphabetical order. Some workspaces are accessible only as links
from other workspaces. These are called either secondary or subsidiary workspaces and are shown
nested beneath their parent, primary workspace. For example, if you right-click on either a row or the
link icon in a table view in the Address Space Overview workspace and choose Link To, you can link to
the Address Space Bottlenecks and Impact Analysis workspace. Some workspaces can be accessed from
multiple places.
• Address Space Overview workspace
– Address Space Bottlenecks and Impact Analysis workspace
– Address Space Bottlenecks Detail workspace
– Address Space Bottlenecks Summary workspace
Address Space Bottlenecks and Impact Analysis workspace
Address Space Bottlenecks Detail workspace
OMEGAMON for MVS - Job Details workspace
– Address Space Common Storage - Active Users workspace
- Address Space Common Storage - Trend Details workspace
- Address Space Common Storage - Allocation Details workspace
- Address Space Common Storage - Orphaned Elements workspace
OMEGAMON for MVS - CSA Analyzer workspace
– Address Space CPU Usage Details workspace
– Address Space CPU Usage Enclaves workspace
– Address Space CPU Utilization workspace
Address Space CPU Usage Details workspace
Address Space CPU Usage Enclaves workspace
Inspect Address Space CPU Use workspace
Enclaves Owned by Selected Address Space workspace
OMEGAMON for MVS - Job Details workspace
WLM Service Class Information for Selected Address Space workspace
– Address Space Details for Job workspace
- Inspect Address Space CPU Use workspace
- Address Space CPU Usage Enclaves workspace
- Address Space Bottlenecks and Impact Analysis workspace
- Dubbed Address Spaces workspace
- Address Space Common Storage - Allocation Details workspace
- Global Enqueue and Reserve workspace
- WLM Service Class Information for Selected Address Space workspace
– Address Spaces in an LPAR workspace
- Historical Common Storage Utilization workspace
- Historical CPU Utilization for an Address Space workspace
- Historical Delays for Address Space workspace
- Historical Memory Object Utilization workspace
- Historical Real Storage Utilization workspace
– Address Space Storage workspace
OMEGAMON for MVS - Job Details workspace
Address Space Storage – Subpools and LSQA workspace
– Address Space Storage – Subpools and LSQA workspace
- Address Space Storage – Subpools and LSQA: Monitored Address Spaces workspace
– Address Space Storage for Job workspace
– Enclaves Owned by Selected Address Space workspace
– Job Device Allocations workspace
- Users of a Dataset workspace
- Users of a Device workspace
– OMEGAMON for MVS - Job Details workspace
– WLM Service Class Information for Selected Address Space workspace
DASD MVS Devices workspace
Enqueue, Reserve, and Lock Summary workspace
• Channel Path Activity workspace
• Common Storage workspace
– Common Storage - Subpools workspace
• Cryptographic Services workspace
Service Call Performance workspace
Top Users Performance workspace
Cross-system Cryptographic Coprocessor Overview workspace
• DASD MVS workspace
• DASD MVS Devices workspace
• Enclave Information workspace
Enclave Details workspace
Address Space Owning Selected Enclave workspace
WLM Service Class Resources workspace
• Enqueue, Reserve, and Lock Summary workspace
Enqueue and Reserve Detail workspace
• Health Check Status workspace
• Health Checks workspace
– Health Check Messages workspace
• JES Subsystem Spool Utilization workspace
• LPAR Clusters workspace
Historical Summary for a CPC workspace
Historical Summary for an LPAR workspace
LPARs Assigned to a Cluster workspace
OMEGAMON for MVS – LPAR PR/SM Processor Statistics workspace
• Operator Alerts workspace
• Page Dataset Activity workspace
• Real Storage workspace
– Storage Shortage Alerts workspace
- Storage Shortage Alerts Details workspace
• Address Space Details for Job workspace
- Storage Shortage Alerts Trends workspace
- Page Dataset Activity workspace
• System CPU Utilization workspace
– HiperDispatch Details workspace
– Multi-Threading Details workspace
– Warning Track Interrupt Details workspace
– 4-Hour Rolling Average MSU Statistics workspace
• System Paging Activity workspace
– Flash Device Activity Details workspace
– Flash Memory Activity Details workspace
• Tape Drives workspace
• User Response Time workspace
• WLM Service Class Resources workspace
Address Space Bottlenecks in Service Class Period workspace
Enclaves in Selected Service Class and Period workspace
Address Space CPU Usage Class and Period workspace
Service Class on System Extended Metrics workspace
• z/OS System Overview workspace
– Address Space Common Storage - Active Users workspace
– Address Space CPU Utilization workspace
– Common Storage workspace
– Enqueue, Reserve, and Lock Summary workspace
– System CPU Utilization workspace
- HiperDispatch Details workspace
- Multi-Threading Details workspace
- Warning Track Interrupt Details workspace
• z/OS UNIX System Services Overview workspace
– Address Space CPU Usage Details workspace (filtered on the OMVS address space name)
– Address Space CPU Usage Details workspace (filtered on the zFS address space)
– Dubbed Address Spaces workspace (filtered for OMVS address space name)
UNIX Processes workspace (for processes in the address space)
– Dubbed Address Spaces workspace (filtered for zFS address space name)
UNIX Processes workspace (for processes in the address space)
– UNIX BPXPRMxx Values workspace
– UNIX Files workspace
UNIX Files workspace (for a selected subdirectory)
UNIX Mounted File Systems workspace (for the mounted file system containing the directory or
file)
– UNIX Kernel workspace
– UNIX Logged-on Users workspace
UNIX Processes workspace (for UID and PID)
– UNIX Mounted File Systems workspace
- UNIX Files workspace (for the mount point)
- UNIX HFS ENQ Contention workspace (for SYSDSN and SYSZDSN Enqueues)
- UNIX Processes workspace (for processes using a selected mounted file system)
UNIX Mounted File Systems workspace (for a selected process)
– UNIX Processes workspace
Dubbed Address Spaces workspace (for address spaces containing processes)
UNIX Mounted File Systems workspace (for file systems in use by a process)
UNIX Threads workspace (for threads in a process)
UNIX Processes workspace (for child processes)
– UNIX Threads workspace
– zFS Overview workspace
zFS User Cache workspace
Prerequisites for data reporting
Some workspaces or attributes display data only if specific conditions are met.
See Table 1 on page 11 for a list of these workspaces and conditions.
Table 1. Prerequisites for data display
Data is available in | Only if
Common storage workspaces | The Common Storage Area Analyzer (CSA Analyzer) is started. Note: The CSA Analyzer is shipped and installed with OMEGAMON for z/OS. It is started as a separate started task.
Channel Path Activity workspace | The Resource Measurement Facility (RMF) has been started.
GRS Ring Systems Data for Sysplex workspace | The global resource serialization (GRS) complex is in ring mode. (If the complex is in star mode, the workspace shows only the name, status, and ring acceleration of each system.)
Cryptographic workspaces | At least one IBM® cryptographic coprocessor is installed and configured.
DASD MVS™ Workspace and DASD MVS Devices Workspace | The Resource Measurement Facility (RMF) is started.
Dynamic I/O device information | OMEGAMON Subsystem is running.
Tape Drives workspace | OMEGAMON Subsystem is running.
4 Hour MSUs attribute in the System CPU Utilization workspace | A defined capacity is used as a basis for pricing and the z/OS system is not running as a guest on z/VM®.
Workspaces showing zIIP and zAAP (IFA) data | Either System z® Application Assist Processors (zAAP) and Integrated Information Processors (zIIP) are configured on the systems, or the PROJECTCPU control in the SYS1.PARMLIB IEAOPTxx member is specified as YES.
LPAR cluster workspaces | The z/OS system is not running as a guest on z/VM.
z/OS UNIX System Services attributes | The address space where the OMEGAMON for z/OS product is running has SUPER USER authority. This level of authority is equivalent to root (UID=0).
Coupling facility, cross-system coupling facility, and system lock data collected by the Resource Measurement Facility (RMF) Distributed Data Server | You have enabled use of RMF data collection as described in Enable RMF data collection, and RMF has been started. Note: If you enable RMF data collection, the Paths Details Workspace (Enhanced 3270 user interface only) is based on new RMF-supplied data.
Near-term history attribute groups and workspaces | You have enabled use of RMF data collection as described in Enable RMF data collection, and RMF has been started.
Customizing workspaces
You can change the workspaces that are supplied with IBM OMEGAMON for z/OS, or use them as models
for creating your own workspaces.
You can:
• add, delete, or modify views
• change queries
• apply thresholds or filters
• change the appearance of tables and charts
• add links to other workspaces, or make a workspace accessible by using a URL.
To change a workspace, create a copy of it; save it with a different name; and then change the copy. If
you keep the original name, your customized workspace will be overwritten the next time you update IBM
OMEGAMON for z/OS.
Tip: To change workspaces, you must have Modify Workspace authority.
If you have OMEGAMON DE on z/OS, you can create workspaces that include both mainframe and
distributed sites, applications, and business processes.
Using cross-product workspace links
Dynamic workspace linking allows you to easily navigate between workspaces that are provided
by multiple products. This feature aids problem determination and improves integration across the
monitoring products, allowing you to quickly determine the root cause of a problem. Predefined cross-
product links provided by the OMEGAMON products allow you to obtain additional information about
systems, subsystems, resources or network components that are being monitored by other monitoring
agents.
When you right-click on a link, a list of links is displayed. This list may contain links to workspaces
provided by one or more monitoring products. The product you are linking to must be installed and
configured and your Tivoli® Enterprise Portal user ID must be authorized to access the target product in
order for the link to that product's workspace to be included in the list.
Choose a workspace from the list to navigate to that workspace. You will link to the target workspace in
context, meaning that you will receive additional information that is related to the system, subsystem or
resource you are currently viewing.
If you choose a workspace from the list and the target workspace is not available, you will receive
message KFWITM081E. For more information, see Cross-product links do not function: message
KFWITM081E.
Predefined links in the Address Space Overview workspace for each managed system allow you to link
to the IBM OMEGAMON for CICS on z/OS Region Overview workspace and the IBM OMEGAMON for
Networks on z/OS Applications Connections workspace to obtain further information related to a selected
address space.
You can also link to the OMEGAMON for CICS Region Overview workspace from the Address Space CPU
Utilization workspace.
Dynamic workspace links in a mixed environment
If you are staging an upgrade, you may have several versions of monitoring agents installed in your
environment. For example, you may have an OMEGAMON for z/OS V5.1 monitoring agent and an
OMEGAMON for z/OS V5.3 monitoring agent running on the same z/OS system during the migration
period.
In this migration scenario, dynamic workspace linking from an OMEGAMON for z/OS V5.3 workspace to an
OMEGAMON for z/OS V5.1 product workspace will work as long as the target workspace exists in the V5.1
product. If the target workspace does not exist, you will receive message KFWITM081E.
In cases where the V5.3 version of the target workspace has been modified (for example to accept link
parameters to limit the data displayed) you may notice different behavior when you upgrade the target
product.
Finding more information about workspaces
The predefined Sysplex workspaces help you to determine if there is a problem in a Sysplex and, if so,
which component of the Sysplex is affected.
System workspaces provide information about the z/OS operating systems.
For more information about customizing workspaces, see IBM Tivoli Monitoring: Administrator's Guide and
the Tivoli Enterprise Portal online Help.
Using the predefined situations
To help you begin monitoring quickly, OMEGAMON for z/OS provides a number of predefined situations.
These situations monitor for conditions that are typically considered to be problematic or noteworthy and
trigger event indicators in the Navigator when those conditions occur.
You must distribute and start the predefined situations before they can begin monitoring. You should
evaluate each situation carefully and adjust its thresholds before you start monitoring.
Activating situations
To activate a situation, you distribute (assign) the situation to one or more managed systems or managed
system lists, and then start the situation.
You can do this by using either the Situation editor in the Tivoli Enterprise Portal, or the Edit menu in the
Enhanced 3270 user interface.
Each predefined situation is already associated with an appropriate Navigator item. After you distribute a
situation, you will see its name listed under the name of its associated item.
Some system-level situations are shipped with very high or very low values, which essentially disable
them. Others have values that may be inconsistent with the policies, goals, or monitoring requirements of
your site. Examine the predefined situations and customize them with values that are meaningful for your
installation before you activate them.
Distributing situations
Distributing situations assigns them to the system or systems you want them to monitor. Distribute only
the situations that you are going to set to autostart or plan to manually enable. If you distribute all the
situations, they will be propagated to the agents when the Tivoli Enterprise Monitoring Server starts. This
may simplify any subsequent activation procedures, but it extends startup time. Review the situations
to determine which ones you plan to use and add distribution lists for only those situations. Once the
situations are distributed, their alerts will appear on the Navigator items they are associated with.
You distribute situations by using the Tivoli Enterprise Portal or the Enhanced 3270 user interface.
To distribute a situation by using the Tivoli Enterprise Portal:
1. Open the Situation editor.
You can access the Situation editor from the toolbar or by right-clicking an item in the Navigator and
selecting Situations... from the popup menu.
2. If necessary, use Set Situation filter criteria to view the situations available for distribution.
Check Eligible for Association to see a list of all the situations which are written for this type of
managed system (MVS Sysplex or MVS System, depending on where you access the Situation editor
from; if you access the editor from the toolbar, you see situations for all types of managed systems).
Any undistributed situations show their icon partially dimmed.
3. Select (click) the situation you want to distribute.
The Situation editor displays the Formula tab for the situation.
4. Select the Distribution tab.
The available managed systems and managed systems lists are displayed.
5. Select the systems and lists to which you want to distribute the situation, and then click the left
arrow to assign the situations to the systems or system lists.
6. Click Apply to save and implement the change and continue editing; click OK to apply and save the
change and close the Situation editor.
To distribute a situation by using the Enhanced 3270 user interface:
See the OMEGAMON enhanced 3270 user interface section in the IBM Tivoli OMEGAMON and Tivoli
Management Services on z/OS Shared Documentation.
Starting situations
You might want to run some situations for a limited time or only under specific conditions. Start and
stop these situations manually. You might want other situations to run continuously. Set these
situations to run at Tivoli Enterprise Monitoring Server startup, so that they run across
monitoring server restarts.
Initially, you might want to start situations manually to evaluate the impact of the monitoring and
monitoring interval on system performance, adjust them accordingly, and then decide if you want the
situation to run indefinitely, across Tivoli Enterprise Monitoring Server restarts.
To start a situation, right-click the situation name in the Situation editor tree and select Start from the
popup menu.
To set a situation to start automatically when the monitoring server starts:
1. Select (click) the name of the situation in the Situation editor tree.
2. The settings for the situation are displayed in the right-hand frame of the editor.
3. On the Formula tab, check Run at startup.
4. Click Apply to save and implement the change and continue editing; click OK to apply and save the
change and close the Situation editor.
To run situations at Tivoli Enterprise Monitoring Server start up, select Run at startup on the Formula tab
of the situation definition in the Situation editor.
Modifying situations
You modify situations using the Situation editor. Before activating any predefined situations, examine
the conditions and values they monitor and, if necessary, adjust them to ones better suited to your
environment.
To modify a situation:
1. Open the Situation editor from the toolbar, or right-click a Navigator entry and select
Situations... from the popup menu.
Tip: If you open the Situation editor by right-clicking a Navigator item, the situation you create is
automatically associated with that item. If you open the editor from the toolbar, you must manually
associate the new situation with a Navigator item in order to see an alert indicator when the situation
evaluates as true.
2. Use the Set Situation filter criteria to view the situations.
If necessary, check Associated with Monitored Application to see all situations that were written for
this type of agent, regardless of where they are distributed.
3. To create a copy, right-click the situation and select Create Another... from the popup menu.
4. Type a name for the new situation and click OK.
5. Modify the situation properties as required and click OK to save the new situation and close the
Situation editor.
Migrating situations
Preexisting situations for IBM OMEGAMON z/OS Management Console cannot be migrated. Any user-
defined situations must be re-created using the OMEGAMON for z/OS Health Check attributes.
Preexisting situations that are built for the attribute groups CF Clients and CF Policy will not trigger
events if coupling facility data is being collected by RMF. To enable these situations to function, change the
threshold values in the predicates slightly using the Tivoli Enterprise Portal Situation editor and then save
and restart the situation. This causes the SQL for the situation to be rebuilt and in the process it switches
to using the new table name. You can then re-edit the situation and set the threshold back to what it was
previously.
Recreating or replacing z/OS Management Console situations
OMEGAMON for z/OS V5.1 and later is incompatible with IBM OMEGAMON z/OS Management Console. If
you previously ran z/OS Management Console situations, you can replace them with existing OMEGAMON
for z/OS situations or recreate them using OMEGAMON for z/OS attributes.
Table 2 on page 15 lists the Sysplex-level OMEGAMON z/OS Management Console situations that do not
have OMEGAMON for z/OS equivalents, along with the attribute group and situation formula you can use
to recreate them. Remember that you will need to distribute and start the new situations on the systems
where you want them to run.
Table 2. Creating OMEGAMON for z/OS situations
Situation name | OMEGAMON for z/OS attribute group | Formula
KHL_CF_Paths_Problem | CF Path | Status==NotOperational
KHL_CF_Policy_Reformat | CF Policy | Reformat_Required==YES
KHL_CF_Structures_Problem | CF Structures | Stucture_Status==Failed OR Stucture_Status==RebuildFailed
KHL_CF_Systems_Problem | CF Systems | Status!=Okay
KHL_XCF_Paths_Problem | XCF Paths | Status!=Working
KHL_XCF_Systems_Problem | XCF System | Status!=Active
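The formulas in Table 2 are simple comparisons on attribute values. As a rough illustration only (plain Python, not the Situation editor or any OMEGAMON interface), the following sketch expresses each formula as a predicate over a row of attribute values. The row dictionaries and the situations_true helper are hypothetical; the attribute names and values are copied from the table as printed.

```python
# Illustrative model of the Table 2 formulas; not an OMEGAMON API.
# Each entry maps a situation name to (attribute group, predicate over a row).
# Attribute names are spelled exactly as they appear in the table.
SITUATIONS = {
    "KHL_CF_Paths_Problem": ("CF Path", lambda r: r["Status"] == "NotOperational"),
    "KHL_CF_Policy_Reformat": ("CF Policy", lambda r: r["Reformat_Required"] == "YES"),
    "KHL_CF_Structures_Problem": (
        "CF Structures",
        lambda r: r["Stucture_Status"] in ("Failed", "RebuildFailed"),
    ),
    "KHL_CF_Systems_Problem": ("CF Systems", lambda r: r["Status"] != "Okay"),
    "KHL_XCF_Paths_Problem": ("XCF Paths", lambda r: r["Status"] != "Working"),
    "KHL_XCF_Systems_Problem": ("XCF System", lambda r: r["Status"] != "Active"),
}


def situations_true(group, row):
    """Return the names of the situations for this attribute group that are true for the row."""
    return [
        name
        for name, (sit_group, predicate) in SITUATIONS.items()
        if sit_group == group and predicate(row)
    ]


# Hypothetical sample rows
failed_rebuild = situations_true("CF Structures", {"Stucture_Status": "RebuildFailed"})
healthy_path = situations_true("XCF Paths", {"Status": "Working"})
```

With the sample rows shown, the first call reports KHL_CF_Structures_Problem and the second reports nothing.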
Table 3 on page 15 lists the OMEGAMON z/OS Management Console situations that have OMEGAMON
for z/OS equivalents. Remember that you must distribute and start the OMEGAMON for z/OS situations
if you want to use them. Check the situation formulas to make sure they use the same values as the
OMEGAMON z/OS Management Console situations you were using previously.
Table 3. OMEGAMON z/OS Management Console and OMEGAMON for z/OS equivalent situations
OMEGAMON z/OS Management Console situation | Equivalent OMEGAMON for z/OS situation
KHL_GTF_Active | OS390_GTF_Active_Crit, OS390_GTF_Active_Warn
KHL_HealthChecker_Problems | KM5_HealthChecker_Problems
KHL_High_Severity_Check | KM5_High_Severity_Check
KHL_OLTEP_Active | OS390_OLTEP_Active_Crit, OS390_OLTEP_Active_Warn
KHL_Paging_Dataset_Utilization | OS390_PLPA_PgeDS_PctFull_Crit, OS390_PLPA_PgeDS_PctFull_Warn
KHL_RMF_Problem | OS390_RMF_Not_Active_Crit, OS390_RMF_Not_Active_Warn
KHL_SMF_Problem | OS390_SMF_Not_Recording_Crit, OS390_SMF_Not_Recording_Warn
KHL_Syslog_Problem | OS390_Syslog_Not_Recording_Crit, OS390_Syslog_Not_Recording_Warn
Note: OMEGAMON for z/OS does not currently have an equivalent for KHL_AddressSpace_Waiting.
Finding more information about situations
For descriptions of all the predefined situations shipped with OMEGAMON for z/OS, including definitions
and advice, see Sysplex situations and System situations.
For more information about creating and modifying situations, see the Tivoli Enterprise Portal User's Guide
in the IBM Tivoli Monitoring documentation and the Tivoli Enterprise Portal online help.
Using historical data collection and reporting
In addition to monitoring and displaying real-time data, OMEGAMON for z/OS can collect data over
extended periods of time and make it available for viewing.
OMEGAMON for z/OS can collect near-term history data from Resource Measurement Facility (RMF),
referred to as RMF near-term history data, and can collect and store historical data using
IBM Tivoli Monitoring, referred to as historical data.
RMF near-term history data
You can view the near-term history data from RMF in OMEGAMON Enhanced 3270 user interface
workspaces. Summary workspaces show data over a time period, for example, 2 hours. Detail workspaces
display data for a particular time in the recent past. You can move forward and backward in time to view
more time periods.
In order for RMF near-term history data to be available, you must enable the use of RMF data collection:
see Configuring the OMEGAMON for z/OS agent to use RMF data and Enable RMF data collection.
Near-term history collection is all or nothing: it is enabled either for all attribute groups or for none.
Historical data stored by IBM Tivoli Monitoring
You can view the historical data stored by IBM Tivoli Monitoring in OMEGAMON for z/OS workspaces in
the Tivoli Enterprise Portal. Table and chart views for which historical data collection is enabled have a
tool for setting a time span. You can see up to 24 hours of previously collected data. If you configured
data warehousing, you can view samples for longer periods of time.
In order for historical data stored by IBM Tivoli Monitoring to be available in the Tivoli Enterprise Portal,
you must configure and start historical data collection for the appropriate attribute groups using the
Tivoli Enterprise Portal (see “Configuring historical data collection” on page 17). In addition, to store
short-term historical data for agents that run on z/OS or report to monitoring servers on z/OS, data sets
must be allocated in the persistent data store, and maintenance of the data store must be configured.
To warehouse data, DB2® or Microsoft SQL Server must be installed and your environment must be
configured to include the Warehouse Proxy agent and Tivoli Data Warehouse.
For information on setting up the persistent data store and configuring maintenance, see:
• the Configuring section in the IBM Tivoli OMEGAMON and Tivoli Management Services on z/OS Shared
Documentation
• Configure historical data collection.
For information about installing and setting up the Tivoli Data Warehouse and the Warehouse Proxy agent,
see the Installation and Configuration Guides in the IBM Tivoli Monitoring documentation.
You can also export the historical data from the Tivoli Enterprise Portal to delimited flat files for use with
third-party reporting tools to produce trend analysis reports and graphics. Data warehoused to the Tivoli
Data Warehouse, a relational database, can be used to produce customized history reports.
Configuring historical data collection
You configure historical data collection using the History Collection Configuration dialog box in the
Tivoli Enterprise Portal.
Note: Near-term history data is enabled independently from historical data stored by IBM Tivoli
Monitoring.
Configuration is done on an attribute-group by attribute-group basis. This allows you to configure
collection for different attribute groups at different intervals so important volatile data may be collected
more often, while less dynamic data can be collected less frequently.
Not all attribute groups can collect historical data. This is because collecting history data for these
attribute groups is not appropriate or would have a detrimental effect on performance. For example,
collection might generate unmanageable amounts of data. Only those attribute groups for which data can
be collected are listed in the Configuration dialog box.
Note that for a given attribute group, the same history collection options are applied to all Tivoli Enterprise
Monitoring Servers for which collection for that attribute group is currently enabled. You cannot specify
different intervals for the same attribute group for different monitoring servers.
Figure 1. History Collection Configuration dialog box
Starting and stopping historical data collection
You start and stop historical data collection for individual attribute groups from the History Collection
Configuration dialog box.
Note: Near-term history data is enabled independently from historical data stored by IBM Tivoli
Monitoring.
In the Select Attribute Group table, select the attribute group or groups for which you want to change
collection status, then press the appropriate button. Collection continues until the agent or monitoring
server is stopped or recycled.
Reducing the impact of requests from large tables
Requests for historical data from tables that collect a large amount of data have a negative impact on
the performance of the product components involved. To reduce the performance impact on your system,
set a longer collection interval for attribute groups that collect a large amount of data, in particular the
Address Space groups, the DASD MVS Devices group, and the Enqueue group (for sites that are active
with WebSphere®). You specify the collection interval from the Configuration tab of the History Collection
Configuration dialog.
Note: No data is collected for Sysplex-level shared DASD unless a DASD filter situation has been
activated. This is to prevent high CPU and storage problems caused by the large amount of data that
may be generated with large DASD volume counts. For more information, see “Monitoring shared DASD”
on page 45.
When you are viewing historical data, set the Time Span interval to the shortest time span setting
sufficient to provide the information you need, especially for tables that collect a large amount of
data. Selecting a long time span interval for the report time span increases the amount of data being
processed, and may have a negative impact on performance. The program must dedicate more memory
and processor cycles to process a large volume of report data.
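As a back-of-the-envelope illustration of why the time span and the collection interval matter, the number of rows that a historical request must process grows roughly with the number of monitored instances multiplied by the number of collection intervals in the span. The function and the sample figures below are hypothetical and are not measurements from any system.

```python
def estimated_rows(instances, span_hours, interval_minutes):
    """Rough row count: one row per monitored instance per collection interval."""
    samples = (span_hours * 60) // interval_minutes
    return instances * samples


# Hypothetical site with 500 address spaces
long_span = estimated_rows(500, span_hours=24, interval_minutes=5)
short_span = estimated_rows(500, span_hours=2, interval_minutes=15)
print(long_span, short_span)  # 144000 4000
```

In this made-up case, shortening the time span from 24 hours to 2 hours and lengthening the collection interval from 5 to 15 minutes reduces the estimate from 144000 rows to 4000 rows.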
If the amount of information requested is too large, the agent may take too long to process the request
and the request may time out. However, the agent continues to process the report data to completion,
and remains blocked, even though the report data is not viewable.
Using summarization and pruning can also help reduce the amount of historical data you store and
request. For more information about using these features, see the User's Guide in the IBM Tivoli
Monitoring documentation and the Administrator's Guide in the IBM Tivoli Monitoring documentation.
Finding more information about historical data collection
For more information on configuring historical data collection and reporting in Tivoli Enterprise Portal, see
the Tivoli Enterprise Portal online Help and IBM Tivoli Monitoring: Tivoli Enterprise Portal User's Guide.
For more information on allocating data sets and configuring the persistent data store, see Configure
historical data collection.
For information on maintaining the persistent data store, exporting historical data to flat files, and
warehousing historical data, see the Planning and Configuring sections of the IBM OMEGAMON and
Tivoli Management Services on z/OS: Shared documentation and the IBM Tivoli Monitoring: Administrator's
Guide.
For information on configuring the Tivoli Data Warehouse, the Warehouse Proxy Agent, and the
Summarization and Pruning Agent, see IBM Tivoli Monitoring: Installation and Setup Guide.
Using Take Action commands
You can use the Tivoli Enterprise Portal Take Action feature to enter a command or to stop or start a
process on any system in your network where one or more monitoring agents are installed, and you
can add Take Action commands to any situations you create using OMEGAMON for z/OS attributes. By
default, any command issued on behalf of OMEGAMON for z/OS is issued as a z/OS command. However,
by prefixing the command, you can cause it to be issued as a UNIX command or as an OMEGAMON for
z/OS command.
Issuing OMEGAMON for z/OS commands
Using the prefix M5:, you can issue a set of product-provided commands to terminate or alter the
execution attributes of an address space.
Use the M5: prefix to issue any of the following commands as a Take Action command. Note that the
colon (:) following the prefix is required.
Command name | Description | Required parameters
Kill | Kill a job or user with a system A22 abend. | Address space name, address space ID (ASID), or hexadecimal address space ID (ASIDX)
SwapIn | Swap in | ASID, ASIDX
MarkSwappable | Mark swappable | ASID, ASIDX
MarkNonSwappable | Mark nonswappable | ASID, ASIDX
ChangeTimeLimit | Change step CPU time limit | ASID or ASIDX, Time
The syntax for the commands is:
M5:command_name,keyword1=value1,keyword2=value2,...
The following are the valid keywords:
ASName
Valid value is an address space name of up to 8 characters.
ASID
Valid value is a decimal ASID number.
ASIDX
Valid value is a hexadecimal ASID number.
Time
The value is expressed in the following format: {+|-}nnnnS|M|H, where:
• + adds time
• - subtracts time
• Absence of + and - sets the CPU time limit to the value specified
• nnnnS is the time in seconds
• nnnnM is the time in minutes
• nnnnH is the time in hours
Neither the command names nor the keywords are case-sensitive.
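To make the Time format concrete, the following sketch parses a Time value into an operation (add, subtract, or set) and a number of seconds. It is an illustration only, not product code. The assumptions are that nnnn means one to four digits, that the unit letter is accepted in either case, and that a value with no unit letter is taken as seconds (inferred from the Time=45 example in this section, which is described as 45 seconds). The parse_time name is hypothetical.

```python
import re

# {+|-}nnnnS|M|H: optional sign, one to four digits, optional unit letter.
# Treating a missing unit as seconds is an assumption based on the Time=45 example.
_TIME_PATTERN = re.compile(r"^([+-]?)(\d{1,4})([SMH]?)$", re.IGNORECASE)
_UNIT_SECONDS = {"S": 1, "M": 60, "H": 3600}
_OPERATIONS = {"+": "add", "-": "subtract", "": "set"}


def parse_time(value):
    """Return (operation, seconds) for a Time keyword value."""
    match = _TIME_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"not a valid Time value: {value!r}")
    sign, digits, unit = match.groups()
    seconds = int(digits) * _UNIT_SECONDS[(unit or "S").upper()]
    return _OPERATIONS[sign], seconds


print(parse_time("+60M"))  # ('add', 3600)
print(parse_time("45"))    # ('set', 45)
```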
For more information on using Take Action commands, see the Tivoli Enterprise Portal User's Guide in the
IBM Tivoli Monitoring documentation or the Tivoli Enterprise Portal online help.
Examples
For example, the following command kills job JFTESTXX, ASID 64:
M5:Kill ASName=JFTESTXX,ASID=64
The following command makes ASID 64 nonswappable:
M5:MarkNonSwappable ASID=64
The following command adds 60 minutes to the step CPU time limit:
M5:ChangeTimeLimit ASID=64,Time=+60M
The following command sets the step CPU time limit to 45 seconds:
M5:ChangeTimeLimit ASID=64,Time=45
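Before a command string is stored in a situation, it can be useful to check that it uses a known command name and known keywords. The following sketch does that in plain Python; it is not part of the product. The parse_m5 name is hypothetical. Because the syntax line in this section separates the command name from the first keyword with a comma while the examples use a space, the sketch accepts either separator and does not assert which one the product requires.

```python
import re

# Command names and keywords from this section, lowercased because
# neither is case-sensitive.
COMMANDS = {"kill", "swapin", "markswappable", "marknonswappable", "changetimelimit"}
KEYWORDS = {"asname", "asid", "asidx", "time"}


def parse_m5(text):
    """Split an M5: command into (command name, {keyword: value})."""
    if not text.startswith("M5:"):
        raise ValueError("missing M5: prefix")
    # Split the command name from the keyword list at the first space or comma.
    parts = re.split(r"[ ,]", text[3:], maxsplit=1)
    name = parts[0].lower()
    if name not in COMMANDS:
        raise ValueError(f"unknown command name: {parts[0]!r}")
    params = {}
    keyword_list = parts[1] if len(parts) > 1 else ""
    for pair in filter(None, keyword_list.split(",")):
        key, sep, value = pair.partition("=")
        key = key.strip().lower()
        if not sep or key not in KEYWORDS:
            raise ValueError(f"unknown or malformed keyword: {pair!r}")
        params[key] = value.strip()
    return name, params


print(parse_m5("M5:Kill ASName=JFTESTXX,ASID=64"))
# ('kill', {'asname': 'JFTESTXX', 'asid': '64'})
```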
Issuing UNIX commands
Using one of several prefixes, you can issue a UNIX program name as a command. You can also issue
UNIX shell commands.
Use any of the following prefixes to issue a Take Action command as a UNIX command. The colon (:)
following the prefix is required.
• OMVS:, Omvs:, or omvs:
• UNIX:, Unix:, or unix:
Thus, for example, D OMVS is issued as a z/OS command, whereas omvs:ps -ef is issued as a UNIX
command. As with z/OS commands, the only result returned is whether or not the command appears to
have started successfully.
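The prefix rules can be summarized as a small routing function. This is an illustrative sketch rather than product logic: it recognizes only the prefix spellings listed in this chapter (M5: and the six OMVS and UNIX variants) and treats everything else as a z/OS command. The classify name and the returned labels are hypothetical.

```python
# Only the spellings listed in the text are recognized; whether other
# case variants are accepted by the product is not assumed here.
UNIX_PREFIXES = ("OMVS:", "Omvs:", "omvs:", "UNIX:", "Unix:", "unix:")
OMEGAMON_PREFIX = "M5:"


def classify(command):
    """Return (kind, command text without prefix) for a Take Action command."""
    if command.startswith(OMEGAMON_PREFIX):
        return "omegamon", command[len(OMEGAMON_PREFIX):]
    for prefix in UNIX_PREFIXES:
        if command.startswith(prefix):
            return "unix", command[len(prefix):]
    # No recognized prefix: issued as a z/OS command by default.
    return "zos", command


print(classify("D OMVS"))       # ('zos', 'D OMVS')
print(classify("omvs:ps -ef"))  # ('unix', 'ps -ef')
```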
Important: The user ID of the address space where the OMEGAMON for z/OS product is defined (the
Tivoli Enterprise Monitoring Server address space) must be defined to z/OS UNIX System Services and
have superuser authority in order to collect UNIX data and relay UNIX commands. See “Authorizing
users to issue UNIX commands” on page 22 and the IBM Tivoli OMEGAMON for z/OS: Planning and
Configuration Guide for more information on defining users to UNIX System Services. In addition, Tivoli
Enterprise Portal user IDs must be authorized to issue UNIX commands.
Environment for issuing UNIX commands
UNIX commands are issued in an environment with particular characteristics.
• No terminal is available. Some commands, such as ps issued without options, will not work in this
environment since they are designed to have a current terminal. Note that, in the case of ps, some
options, for example, -A, enable ps to execute without requiring a terminal.
• The specified command may be any shell command or UNIX program, including any REXX program
written for the shell, that does not require a terminal or a particular environmental condition (for
example, a specific user ID).
• stdin is initially assigned to /dev/null (the equivalent of an empty file).
• stdout and stderr are initially assigned to /dev/console (the z/OS SYSLOG).
• stdin, stdout, and stderr can be redirected following standard UNIX redirection conventions.
• The shell is executed in a new process in a separate address space. This insulates OMEGAMON for z/OS
from the effects of the command and vice versa.
• The shell has the same authorization as OMEGAMON for z/OS.
• The initial working directory is /.
• The HOME environment variable is set to /.
• No other environment variables are set before the shell is started.
• The /bin/sh shell is used to issue the specified command.
• The shell performs normal login profile processing starting with /etc/profile before issuing the
specified command. This can result in further environment variables being set before the specified
command is issued. If profile processing terminates the shell before the specified command has been
issued, for example, by issuing the exit shell command, the specified command is not issued.
• The specified command will not terminate when OMEGAMON for z/OS terminates if OMEGAMON for
z/OS terminates first.
• Termination, abnormal or otherwise, of the specified command will not cause OMEGAMON for z/OS to
terminate.
• OMEGAMON for z/OS does not maintain or report any status information about the shell or specified
command other than that collected as part of its normal system monitoring functions.
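To check ahead of time whether a command depends on a terminal or on inherited environment variables, you can approximate some of these characteristics on any UNIX-like system. The following Python sketch is only an approximation and does not show how OMEGAMON for z/OS issues commands: it runs /bin/sh with stdin assigned to /dev/null, the working directory set to /, and HOME=/ as the only environment variable. It captures stdout and stderr instead of writing them to /dev/console, it does not perform login profile processing, and the command runs with the caller's authority. The run_like_take_action name is hypothetical.

```python
import subprocess


def run_like_take_action(command, timeout=30):
    """Run a shell command in a minimal, terminal-less environment."""
    return subprocess.run(
        ["/bin/sh", "-c", command],
        stdin=subprocess.DEVNULL,   # stdin is /dev/null
        stdout=subprocess.PIPE,     # captured here instead of /dev/console
        stderr=subprocess.STDOUT,   # stderr merged into the captured output
        cwd="/",                    # initial working directory is /
        env={"HOME": "/"},          # HOME is the only variable set
        text=True,
        timeout=timeout,
    )


result = run_like_take_action('echo "$HOME"; pwd')
print(result.returncode, result.stdout.split())
```

A command that fails or produces unexpected output under these conditions is likely to need adjustment, for example an explicit path or an option that removes the need for a terminal, before it is used as a Take Action command.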
Redirecting UNIX command output
By default, the output of both z/OS and UNIX commands is written to the z/OS system log. You cannot
redirect the output of a z/OS command. However, you can redirect both the input stream and output of a
UNIX command by following standard UNIX redirection conventions.
For example, the command omvs:ps -ef>/tmp/myoutput sends the output of the ps command to
a file called /tmp/myoutput. Redirect command output to a file for later examination and to avoid
cluttering the z/OS log.
Running commands in the background
Each UNIX command is run as a process in a separate address space using the /bin/sh shell. When
OMEGAMON for z/OS is used to start a long-running UNIX command, you may notice an address space
that persists until the command ends. This address space is in addition to the one running OMEGAMON
for z/OS and the one running the command itself. You can avoid the extra address space by running the
command in the background.
To run the command in the background, end the command line with the UNIX shell symbol &
(ampersand).
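Redirection and background execution can be combined in one command string. The following sketch composes such a string; the unix_take_action helper and the file names are hypothetical, and the omvs: prefix is one of the prefixes listed earlier in this chapter. The helper puts a space before the > symbol so that a command ending in a number (for example, sleep 600) does not have that number interpreted by the shell as a file descriptor.

```python
import shlex


def unix_take_action(command, output_file=None, background=False):
    """Compose an omvs: Take Action string with optional redirect and background."""
    text = "omvs:" + command
    if output_file is not None:
        # Space before ">" keeps a trailing numeric argument from being
        # read as a file descriptor number; quote the path for the shell.
        text += " > " + shlex.quote(output_file)
    if background:
        text += " &"
    return text


print(unix_take_action("ps -ef", "/tmp/myoutput"))
# omvs:ps -ef > /tmp/myoutput
print(unix_take_action("sleep 600", "/tmp/sleep.out", background=True))
# omvs:sleep 600 > /tmp/sleep.out &
```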
Testing commands
Before using commands as part of situations or policies, you should run some simple tests from the user
interface to ensure that commands are working as expected.
For example, issuing the omvs:set>/tmp/cantest Take Action command results in the output of the
set command being placed in the /tmp/cantest file.
Authorizing users to issue UNIX commands
By default, only user IDs that have been defined to z/OS UNIX System Services and that have superuser,
or root, authority are allowed to issue UNIX commands through the Tivoli Enterprise Portal.
Users are defined to z/OS UNIX using RACF® commands. The z/OS UNIX attributes are kept in the OMVS
segment of the RACF user’s profile. This means that to issue UNIX commands:
• The user's Tivoli Enterprise Portal user ID must be defined in RACF.
• The profile associated with the RACF user ID must contain an OMVS segment.
• In the OMVS segment, the z/OS UNIX user identifier (UID) must have a value of 0 (superuser).
You can override the default validation behavior by adding one of two parameters to the KDSENV member
of &shilev.&rtename.RKANPARU on the system or LPAR on which the command is being executed:
• You can allow any RACF user ID defined to z/OS UNIX System Services to issue UNIX
commands, regardless of level of authorization, by adding the variable KOE_ALLOW_ANY_UID=1 to
&shilev.&rtename.RKANPARU(KDSENV) on the LPAR where the command is to be executed.
• You can allow any RACF user ID to issue UNIX commands, whether or not it has been
defined to z/OS UNIX System Services, by adding the variable KOE_ALLOW_UNDEFINED=1 to
&shilev.&rtename.RKANPARU(KDSENV) on the LPAR where the command is to be executed.
If you want any user with a Tivoli Enterprise Portal user ID to be able to issue UNIX commands, add both
KOE_ALLOW_ANY_UID=1 and KOE_ALLOW_UNDEFINED=1 parameters.
Tip: For commands used in situations and policies, the user ID verified is the ID of the user who last saved
the situation.
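The validation rules above can be summarized as a small decision function. This Python sketch is only a model of one reading of those rules, not the product's implementation. In particular, it assumes that KOE_ALLOW_UNDEFINED=1 on its own does not admit a user whose OMVS segment has a nonzero UID, which is why both parameters are needed to admit every user. The function and argument names are hypothetical.

```python
def may_issue_unix_command(racf_defined, omvs_uid,
                           allow_any_uid=False, allow_undefined=False):
    """Model of the authorization rules described above (an assumption-laden sketch).

    racf_defined    -- True if the portal user ID is defined in RACF
    omvs_uid        -- the UID from the OMVS segment, or None if there is no segment
    allow_any_uid   -- models KOE_ALLOW_ANY_UID=1
    allow_undefined -- models KOE_ALLOW_UNDEFINED=1
    """
    if not racf_defined:
        return False
    if omvs_uid is None:
        # Not defined to z/OS UNIX: allowed only with KOE_ALLOW_UNDEFINED=1.
        return allow_undefined
    # Defined to z/OS UNIX: superuser by default, any UID with KOE_ALLOW_ANY_UID=1.
    return omvs_uid == 0 or allow_any_uid


print(may_issue_unix_command(True, 0))
print(may_issue_unix_command(True, 100))
print(may_issue_unix_command(True, 100, allow_any_uid=True))
print(may_issue_unix_command(True, None, allow_undefined=True))
```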
Memory list and memory zap
You can display and zap memory from the OMEGAMON Enhanced 3270 user interface. You can use this
feature to view the programs running in an address space; change the instructions in a running application
to correct a problem; and view or change data that is being processed by the application.
When you configure OMEGAMON for z/OS, use a security class other than OMEGDEMO. This is
because OMEGDEMO bypasses security checking by the OMEGAMON Enhanced 3270 user interface
and OMEGAMON for z/OS. When you specify a security class other than OMEGDEMO, users are denied
access to memory list and memory zap by default. To enable memory list and memory zap, additional SAF
definitions are required: see Security for memory list and memory zap.
For information on enabling security for the OMEGAMON Enhanced 3270 user interface, see Enable
security for the OMEGAMON enhanced 3270 user interface.
For information on authorizing users to issue Take Action commands, see Prefixed Take Action
commands.
Solving problems with memory list and memory zap
1. In the OMEGAMON Enhanced 3270 user interface, navigate to View > Memory.
2. Specify the address space for which you want to display the memory.
The following image displays the TCB Storage and LSQA for a selected address space.
3. Select a row, and then navigate to the TCB Storage by Subpool and Storage Key workspace. Note the
addresses of storage that might be interesting.
4. Navigate to the Memory Display/Zap workspace, and then type the address in which you are
interested.
5. Type Z to display the ZAP memory pop-up.
6. Type a replacement string, and then press Enter.
OMEGAMON for z/OS zaps the memory, and then displays a confirmation message.
7. Press Enter again to display the storage that you updated.
Security for memory list and memory zap
The memory list and memory zap functions require two separate layers of SAF security:
• You must define the GENERIC resource profiles needed for the MEMLIST and MEMZAP functions, which
are implemented as TAKEACTION commands, in the OMEGAMON for z/OS security class. For more
information, see “GENERIC resource profiles” on page 24.
• PassTicket authentication between the enhanced 3270UI and the OMEGAMON for z/OS agent enforces
a secure sign-on for the memory list and memory zap functions. For more information, see “PassTicket”
on page 25.
GENERIC resource profiles
Security for OMEGAMON for z/OS Take Action commands is based on SAF security classes and resource
profile names. During product configuration, you specify the name of the SAF security class that is used
to validate the Take Action commands by setting the RTE_SECURITY_CLASS parameter in the PARMGEN
parameter deck.
If you want a separate SAF security class just for OMEGAMON for z/OS Take Action commands, set the
KM5_SECURITY_ACTION_CLASS parameter.
If you are using RACF as your SAF product, use the following command to activate the SAF security class
and the SETROPTS RACLIST processing:
SETROPTS CLASSACT(class_name) RACLIST(class_name) GENERIC(class_name)
After you define the SAF security class, create resource profiles to control access to the Take Action
commands. (If you do not create any resource profiles to control Take Action commands, all commands
are denied.)
To see if users are authorized to use the Take Action commands, the OMEGAMON Enhanced 3270 user
interface checks the following resource profile:
KM5.msn.TAKEACTION.*
where msn is the managed system name.
You must create a profile for the global security class (RTE_SECURITY_CLASS), and then give update
access to the profile to all users whom you want to use any Take Action commands from the OMEGAMON
Enhanced 3270 user interface. The OMEGAMON Enhanced 3270 user interface address space uses SAF
validation to determine whether a user is authorized to issue any Take Action commands. SAF validation
for product specific commands is performed by the monitoring agent.
For more granular access control, create more resource profiles. For example:
• To control all OMEGAMON for z/OS Take Action commands on all managed systems, define a profile
named KM5.**.TAKEACTION.*, and then grant access to one or more users or groups:
RDEFINE class_name KM5.**.TAKEACTION.* UACC(NONE)
PERMIT KM5.**.TAKEACTION.* ID(user_id) ACCESS(UPDATE) CLASS(class_name)
• To control the ability to issue all OMEGAMON for z/OS Take Action commands from a monitoring agent
that is running on a system with an SMFID of SYSA and Sysplex name of PLEXA, define a profile named
KM5.PLEXA:SYSA.TAKEACTION.*, and then grant access to one or more users or groups:
RDEFINE class_name KM5.PLEXA:SYSA.TAKEACTION.* UACC(NONE)
PERMIT KM5.PLEXA:SYSA.TAKEACTION.* ID(user_id) ACCESS(UPDATE) CLASS(class_name)
The memory list and memory zap features use these Take Action resource profiles:
• MEMLIST controls the ability to display memory.
• MEMZAP controls the ability to change memory contents.
For example:
• To control the ability to display memory from all address spaces running on systems where an
OMEGAMON for z/OS agent is running, define a profile named KM5.**.TAKEACTION.MEMLIST, and
then grant access to one or more users or groups:
RDEFINE class_name KM5.**.TAKEACTION.MEMLIST UACC(NONE)
PERMIT KM5.**.TAKEACTION.MEMLIST ID(user_id) ACCESS(UPDATE) CLASS(class_name)
• To control the ability to modify memory of all address spaces running on the same system as an
OMEGAMON for z/OS monitoring agent that is running on a system with an SMFID of SYSA and Sysplex
name of PLEXA, define a profile named KM5.PLEXA:SYSA.TAKEACTION.MEMZAP, and then grant
access to one or more users or groups:
RDEFINE class_name KM5.PLEXA:SYSA.TAKEACTION.MEMZAP UACC(NONE)
PERMIT KM5.PLEXA:SYSA.TAKEACTION.MEMZAP ID(user_id) ACCESS(UPDATE) CLASS(class_name)
When you have added all the GENERIC resource profile definitions to the security class, issue the
following command to refresh the security class and activate the changes:
SETROPTS RACLIST(class_name) REFRESH
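When several profiles and users are involved, it can help to generate the RACF command text instead of typing each command. The following Python sketch builds the RDEFINE, PERMIT, and SETROPTS REFRESH commands in the forms shown above. The function names, the class name MYCLASS, and the user IDs are hypothetical; the sketch only produces text and does not issue any command.

```python
def take_action_profile(msn="**", command="*"):
    """Build a profile name of the form KM5.msn.TAKEACTION.command."""
    return f"KM5.{msn}.TAKEACTION.{command}"


def racf_commands(class_name, profile, user_ids):
    """Return the RACF command text for one profile and a list of users or groups."""
    commands = [f"RDEFINE {class_name} {profile} UACC(NONE)"]
    for user_id in user_ids:
        commands.append(
            f"PERMIT {profile} ID({user_id}) ACCESS(UPDATE) CLASS({class_name})"
        )
    # Refresh the class so that the new definitions take effect.
    commands.append(f"SETROPTS RACLIST({class_name}) REFRESH")
    return commands


# MYCLASS, USER1, and USER2 are placeholder values for illustration.
profile = take_action_profile(msn="PLEXA:SYSA", command="MEMZAP")
for line in racf_commands("MYCLASS", profile, ["USER1", "USER2"]):
    print(line)
```

Review the generated text before you submit it; the sketch does not validate class names, profile names, or user IDs.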
PassTicket
Requests to display memory or zap memory require a secured sign-on from the OMEGAMON Enhanced
3270 user interface to the OMEGAMON for z/OS monitoring agent.
The OMEGAMON Enhanced 3270 user interface generates a PassTicket (that is, a one-time only
password), and then sends it to the OMEGAMON for z/OS monitoring agent in the data request. This
enables the monitoring agent to authenticate that the request came from the user who is logged into the
interface.
1. To enable a PassTicket to be generated, the PTKTDATA security class must be activated. To activate
the PTKTDATA class and the SETROPTS RACLIST processing, issue the following command:
SETROPTS CLASSACT(PTKTDATA) RACLIST(PTKTDATA) GENERIC(PTKTDATA)
2. The PassTicket key class enables the security administrator to associate a RACF secured sign-on
secret key with an application that uses RACF for user authentication. All profiles that contain
PassTicket information are defined to the PTKTDATA class.
Define a profile in the PTKTDATA class definition for each OMEGAMON for z/OS monitoring agent which
you want to enable for memory list and/or memory zap functions. The KEYMASKED value may be any
combination of 16 hex digits; in the example below, it is 0123456789ABCDEF.
RDEFINE PTKTDATA zOSAgent_STC SSIGNON(KEYMASKED(0123456789ABCDEF))
3. Grant users and groups access to an OMEGAMON for z/OS profile:
PERMIT zOSAgent_STC CLASS(PTKTDATA) ID(user_id) ACCESS(UPDATE)
4. Each OMEGAMON for z/OS monitoring agent must also have a resource profile defined to the RACF
application class (APPL). The same users and groups permitted to the OMEGAMON for z/OS monitoring
agent’s PTKTDATA profile should also be permitted to the agent's profile defined to the APPL class. The
following definitions apply to the APPL class.
To activate the APPL class and the SETROPTS RACLIST processing, issue the following command:
SETROPTS CLASSACT(APPL) RACLIST(APPL)
Define a profile in the APPL class for each OMEGAMON for z/OS monitoring agent that you want to
enable for memory list and/or memory zap functions:
RDEFINE APPL zOSAgent_STC UACC(NONE)
Grant users access to the OMEGAMON for z/OS monitoring agent’s APPL profile:
PERMIT zOSAgent_STC CLASS(APPL) ID(user_id) ACCESS(UPDATE)
5. When you have added all the OMEGAMON for z/OS monitoring agent’s resource profile definitions to
the PTKTDATA and APPL security classes, issue the following command to refresh the security classes
and activate the changes:
SETROPTS RACLIST(PTKTDATA) REFRESH
SETROPTS RACLIST(APPL) REFRESH
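The KEYMASKED value in step 2 is 16 hexadecimal digits, and the value 0123456789ABCDEF is only an example. The following Python sketch shows one possible way to produce a random 16-digit value and to check that a value has the required form. The function names are hypothetical, and the use of Python's secrets module is a suggestion of this sketch rather than a documented requirement; follow your site's key management practices.

```python
import re
import secrets


def new_keymasked_value():
    """Return 16 random hexadecimal digits (8 random bytes) in uppercase."""
    return secrets.token_hex(8).upper()


def is_valid_keymasked(value):
    """Check that the value is exactly 16 hexadecimal digits."""
    return re.fullmatch(r"[0-9A-Fa-f]{16}", value) is not None


key = new_keymasked_value()
print(is_valid_keymasked(key))
# zOSAgent_STC is the placeholder profile name used in the steps above.
print(f"RDEFINE PTKTDATA zOSAgent_STC SSIGNON(KEYMASKED({key}))")
```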
Chapter 2. Using the Inspect function
Inspect is a diagnostic tool whose primary purpose is to help you understand where, within an address
space, code is spending its time. You can then use that information either to optimize the code, or to
identify where within the code a program might be looping. You might use the Inspect function, for
example, when a workspace or situation event shows an address space with high processor usage.
Invoking the Inspect function
You run the Inspect function by selecting an Inspect link from the Address Space CPU Utilization view in
the Address Space CPU Utilization workspace. The parameters Inspect uses to collect the data are specified
in the link definition, and the resulting data for the selected address space is displayed in the Inspect
Address Space CPU Use workspace.
Two Inspect links are available from the Address Space CPU Utilization table:
• Inspect Address Space CPU Use
This link uses default parameters of 1000 samples at 5 millisecond intervals.
• Inspect with 5000 samples at 2ms interval
This link uses the specified parameters (5000 samples at 2 millisecond intervals), but it is intended to
be used as a template to set your own sample count and sampling interval.
Modifying the template Inspect link
To specify the sample count and sampling interval used to collect Inspect data, modify the template
Inspect link (Inspect with 5000 samples at 2ms interval).
Tip: The Inspect Address Space CPU Use workspace is not populated with data until the Inspect agent
completes on the target system. The time it takes to complete is a function of the number of samples and
the sampling interval specified in the link definition. For example, taking 1000 samples at a 5-millisecond
interval (the default settings) requires 5 seconds for the data collection process to complete. When you
are selecting the values for number of samples and the sampling intervals, bear in mind that if the total
time taken to execute the agent exceeds the client timeout value, the Tivoli Enterprise Portal will return
no data, even if the agent subsequently completes normally.
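The arithmetic in this tip is the number of samples multiplied by the sampling interval. The following Python sketch computes that minimum collection time and compares it with a client timeout. The function names are hypothetical, the timeout of 30 seconds is a placeholder rather than a documented value, and the calculation ignores any additional processing or network time.

```python
def collection_seconds(samples, interval_ms):
    """Minimum data collection time: samples multiplied by the interval."""
    return samples * interval_ms / 1000.0


def fits_within_timeout(samples, interval_ms, client_timeout_s):
    """True if collection alone finishes before the client timeout expires."""
    return collection_seconds(samples, interval_ms) < client_timeout_s


print(collection_seconds(1000, 5))   # default link: 1000 samples at 5 ms
print(collection_seconds(5000, 2))   # template link: 5000 samples at 2 ms
print(fits_within_timeout(5000, 2, client_timeout_s=30))  # 30 is a placeholder
```

The default link therefore needs at least 5 seconds and the template link at least 10 seconds before any data can be returned.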
To modify the template Inspect link:
1. Open the Address Space Overview workspace for the target system.
2. In the Address Space Counts table, right-click the link button and select Address Space CPU
Utilization from the pop-up window.
3. After the Address Space CPU Utilization workspace is displayed, right-click the link button for a
row in the Address Space CPU Utilization table and select Link Wizard from the popup menu.
The Workspace Link Wizard editor is displayed.
4. Select Modify an existing link, and then click Next.
5. In the Link to Modify window, select Inspect with 5000 samples at 2ms interval, and then
click Next.
6. In the Link Name window, type a new name and description for the link, if you want, and then click
Next.
The Target Filters window is displayed.
7. Click Next to display the Parameters window.
8. In the Parameters window, select the INTERVAL row, and then click Modify Expression.
The Expression editor opens.
9. In the Expression editor, type the interval, in milliseconds, at which you want Inspect to collect
samples, and then click OK.
The Parameters window is redisplayed.
10. In the Parameter tree, select the SAMPLES row, and then click Modify Expression.
11. In the Expression editor, type the number of samples you want Inspect to use in deriving the data,
and then click OK.
The Parameters window is redisplayed with the changes.
12. Click Next.
The wizard asks you to review the changes. If they are correct, click Finish to save the changes and
close the wizard.
If you right-click a link button in the Address Space CPU Utilization table, you see the name you assigned to the
modified link.
The new name, if any, and the parameters you set persist until you close the workspace.
The Inspect Address Space CPU Use workspace
The results of the inspection are displayed in the Inspect Address Space CPU Use workspace. The
workspace contains three views: Sampling Statistics, Agent Messages, and Inspect Data.
The Sampling Statistics table view shows the number of samples requested, the sampling interval in
milliseconds, the number of samples collected, and the number of samples used. Typically, the number of
samples collected is the same as the number requested. However, if the job being inspected ends before
the Inspect agent has finished collecting data, the number of samples collected is the number collected
up to the point where Inspect detected that the target job had ended.
The number of samples used is the number of times that the Inspect agent saw CPU activity in the target
address space and gives you some indication as to the statistical accuracy of the resultant inspect data.
The number of samples used value does not represent the number of rows of Inspect data.
The Agent Messages view displays any error or informational messages returned by the Inspect agent.
These messages help to explain the resultant data (or lack thereof) that you see in the other views. For
example, if no CPU activity was seen by Inspect in the address space being inspected, the agent returns
a message indicating that; the number of samples used column in the Sampling Statistics view would be
zero; and no data would be displayed in the Inspect Data view.
The Inspect Data view contains the output from the inspection process. The Inspect agent returns data
only for elements for which it saw CPU activity. The data is ordered in descending CPU activity order with
the following hierarchy:
Task control block (TCB)
Load module within TCB
Control section (CSECT) within load module
Block of code within CSECT
Task control block
For each TCB for which it sees processor activity, Inspect attempts to determine the executing load
modules consuming the processor time (CPU). For each load module, Inspect then attempts to map
the CSECT structure of the load module and assign the load module processor time to the appropriate
CSECTs. This allows you to determine which load modules and CSECTs within the load module are
consuming processor time.
Inspect maps the CSECT structure for each load module (with processor activity) by scanning the target
address space for load libraries and attempting to read in the load module from each library in turn. It
also scans SYS1.LINKLIB, SYS1.NUCLEUS and SYS1.LPALIB for load modules. If Inspect cannot locate
the load module, the CSECT name is unknown and the entire load module is considered to be one large
CSECT.
Inspect then further breaks down the CPU time attributed to each CSECT into blocks of code, the size
of which is calculated by Inspect once the data collection process has completed. In order to prevent
Inspect from flooding the client workspace with rows of data, the Inspect agent attempts to calculate
a granularity (block) size that will limit the number of rows of data returned to about 100, but where
possible the Inspect agent will use a granularity size of 16 bytes (0X00000010). The granularity size used
is displayed in the Agent Messages view of the Inspect Address Space CPU Use workspace. The granular data is shown in
the rightmost two columns of the Inspect Data view. Again, these are displayed in descending processor
use order with the most active blocks of code within each CSECT being at the top of the display rows for
each CSECT.
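The text above states the goals of the granularity calculation (about 100 rows, with 16 bytes preferred) but not the method. The following Python sketch is one illustrative way to meet those goals for the samples of a single CSECT; it is not the Inspect agent's algorithm. Doubling the block size is an assumption of this sketch, and all names are hypothetical.

```python
from collections import Counter

MIN_GRANULARITY = 16   # preferred block size, in bytes, from the text
TARGET_ROWS = 100      # approximate row limit from the text


def choose_granularity(offsets):
    """Pick the smallest block size that yields at most TARGET_ROWS blocks."""
    size = MIN_GRANULARITY
    while len({offset // size for offset in offsets}) > TARGET_ROWS:
        size *= 2  # doubling is an assumption, not documented behavior
    return size


def block_percentages(offsets, size):
    """Group sampled offsets into blocks; return (block start, percent), busiest first."""
    counts = Counter((offset // size) * size for offset in offsets)
    total = len(offsets)
    rows = [(start, 100.0 * n / total) for start, n in counts.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


samples = [0xAE0, 0xAE4, 0xAE8, 0xBD0]   # made-up sample offsets within one CSECT
size = choose_granularity(samples)
for start, pct in block_percentages(samples, size):
    print(f"0X{start:08X} {pct:.1f}")
```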
Tip: Before you examine the Inspect data, review the Samples Used field in the Sampling Statistics view.
This field indicates the statistical validity of the sampled data. A low number of samples indicates that
the Inspect data may not give a truly representative view of where the code in the target address space is
spending its time. Also review any messages in the Agent Messages view that may indicate that the data
might be incomplete, or explain why there is no data. You can find descriptions of the Inspect messages in
the online help.
For descriptions of the information in the columns of these tables, see Inspect Address Space CPU Use
attributes.
Understanding the Inspect data–an example
In the Inspect Data view, the data is organized by task control block (TCB) in descending order of
processor (CPU) use, so the most active items are first in the display.
In the following example, Inspect saw one TCB active. The program that was attached to create this TCB
was called KLV. Because Inspect saw only one TCB active, 100% of the processor time that Inspect saw
being used is attributed to this TCB. The TCB Ended column is blank, indicating that this TCB did not end
while Inspect was running.
Inspect saw two load modules in use during its sampling activities, KM5AGENT and IEAVXPCA. Since
KM5AGENT is first, this was the most active load module. However, you can also see that the second load
module spent some of its time executing within the PCAUTH address space, as shown by the Load Module
ASID Hex and Load Module Jobname columns.
TCB Address  Initial  TCB CPU %  TCB    Load Module  Load Module  Load Module
             Program  of Job     Ended  Name         ASID Hex     ASID Jobname
0X007F83D0   KLV      100.0             KM5AGENT
                                        IEAVXPCA     0X0002       PCAUTH
The Load Module Address column shows where in storage in the target address space the load module is
loaded. Following that are two columns which show how the CPU time used by this load module breaks
down as a percentage of all the CPU time used by the address space (Load Module CPU % Job) and of
this TCB (Load Module CPU % TCB). Because there is only one TCB in this instance, these numbers are
the same for each load module. (If more TCBs were active, you would see the CPU time used by each load
module as a percentage of both the overall job and its owning TCB.)
Load Module Address  Load Module CPU % of TCB  Load Module CPU % of Job
0X12E8F670           73.3                      73.3
0X123020A0           26.6                      26.6
Next, the CSECT Name column shows that in this instance all the CPU time seen by Inspect in each load
module was being consumed by a single CSECT: KM3PBM1 in KM5AGENT and IEAVXRFE in IEAVXPCA.
The CSECT Offset in Load Module column shows the offset of this CSECT within the load module, and
the CSECT Address column shows the address of the CSECT within the target address space storage.
You could use this information to locate the CSECT within a dump of the address space, for example to
confirm an eye catcher that might contain a compile date or time or some other version information.
In the next columns, the CPU time attributed to the CSECT is displayed as a percentage of the total CPU
time for the job, the load module, and the TCB. This can help you understand how much the CSECT is
being used overall, by each load module, and by each task within the address space. This can help you to
identify the CSECTs that are most heavily used.
CSECT Name  CSECT Offset    CSECT Address  CSECT CPU %     CSECT CPU %  CSECT CPU %
            in Load Module                 of Load Module  of TCB       of Job
KM3PBM1     0X000A9580      0X12F38BF0     100.0           73.3         73.3
IEAVXRFE    0X0005AB50      0X12307B50     100.0           26.6         26.6
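The columns in this example are related by simple arithmetic, which can be useful when you locate a CSECT in a dump. The following Python sketch assumes that the CSECT address is the load module address plus the CSECT offset, and that a CSECT's share of the job is its share of the load module scaled by the load module's share of the job. The KM3PBM1 row is consistent with both assumptions. The function names are hypothetical.

```python
def csect_address(load_module_address, csect_offset):
    """Address of a CSECT: load module address plus the CSECT offset."""
    return load_module_address + csect_offset


def pct_of_whole(pct_of_part, part_pct_of_whole):
    """Scale a percentage of a part into a percentage of the whole."""
    return pct_of_part * part_pct_of_whole / 100.0


# Values for KM3PBM1 in load module KM5AGENT, taken from the example tables.
address = csect_address(0x12E8F670, 0x000A9580)
print(f"0X{address:08X}")          # prints 0X12F38BF0
print(pct_of_whole(100.0, 73.3))   # CSECT CPU % of job, approximately 73.3
```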
In the final columns, the granular level data is shown. This breaks each CSECT down into blocks of code,
based on the granularity size calculated by Inspect. The granularity size used is shown in the Agent
Messages view. This allows you to further refine where within each CSECT the code is spending its time
and identify areas that may be looping or that might benefit from optimization.
Offset in CSECT  CPU % of CSECT
0X00000AE0       36.3
0X00000BD0       33.0
0X00000CD0       30.6
0X00000AA0       100.0
To understand how the granular level data relates to the program source (section of code), you can then
refer to the link-editor and compiler listings.
Chapter 3. Using dynamic links to OMEGAMON for MVS
A Tivoli Enterprise Portal terminal view enables you to connect to any TN3270 host system with TCP/IP
from inside a workspace. Terminal views also have scripting capability with record, playback, and
authoring of entire scripts. If you associate a terminal view with a connection script and a query that
returns appropriate values, you can configure a view that opens to a specific panel of a 3270 application.
OMEGAMON for z/OS has taken advantage of this capability to create dynamic links from several existing
workspaces to target workspaces that display related OMEGAMON for MVS screens.
The data used to connect to OMEGAMON for MVS and to navigate to the target screens is retrieved from
variables specified during configuration of OMEGAMON for z/OS monitoring agents and from attributes
identified in the dynamic link.
Like the product-provided situations, you can use the predefined workspaces, queries, and scripts used to
create these links as models or templates for creating additional links. You can also create your own links
and target workspaces from scratch.
How dynamic linking works
Dynamic linking involves the following components:
• A source workspace that provides the context (attribute values) for the link
• A target workspace that contains a terminal emulator view associated with a query and a suitable
connection and navigation script
• A dynamic link from the source workspace to the target workspace that specifies the attribute values
from the source workspace that are used for navigation to the appropriate screen.
• A query that retrieves additional required values from the OMEGAMON agent to be used by the script
associated with the terminal adapter
• A script that uses the values from the query and the link to connect to an OMEGAMON for MVS session and
navigate to the appropriate screen
When a dynamic link on a row in the source workspace is selected, the link picks up the values of the
attributes specified in the link definition from the row and passes them to the target workspace. The query
associated with the terminal view in the target workspace is executed to obtain connection information.
The query is processed by the monitoring agent and discovered or configured information is passed back
to the terminal emulator.
When a Tivoli Enterprise Portal user attempts to connect to a session, a TN3270 logon window prompts
for a user ID and password. These credentials can be saved for the duration of the Tivoli Enterprise
Portal session so that the user is not prompted for them again. They are lost when the Tivoli Enterprise
Portal client is terminated. If the script associated with the session is modified to include the user ID and
password for the OMEGAMON session, or if no user ID or password is required for the logon, the window
can be dismissed without them.
Note: The stored user credentials are associated with the specific OMEGAMON session being linked to,
as defined by its host name, port number, LU Group and APPLID. If the same terminal emulator (and
workspace) is used to connect to a different classic instance, the user is prompted again for another user
ID and password.
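The behavior described in this note amounts to a cache keyed by the four values that identify a session. The following Python sketch models that behavior; it is not the Tivoli Enterprise Portal implementation, and the class names, host name, and APPLID values are hypothetical.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class SessionKey(NamedTuple):
    """The four values that identify an OMEGAMON session, per the note above."""
    host: str
    port: int
    lu_group: str
    applid: str


class CredentialCache:
    """Hypothetical in-memory store; its contents are lost when it is discarded."""

    def __init__(self) -> None:
        self._saved: Dict[SessionKey, Tuple[str, str]] = {}

    def save(self, key: SessionKey, user_id: str, password: str) -> None:
        self._saved[key] = (user_id, password)

    def lookup(self, key: SessionKey) -> Optional[Tuple[str, str]]:
        # None means that the user must be prompted for this session.
        return self._saved.get(key)


cache = CredentialCache()
first = SessionKey("sysa.example.com", 23, "", "APPLA")    # made-up values
second = SessionKey("sysa.example.com", 23, "", "APPLB")   # different APPLID
cache.save(first, "USER1", "secret")
print(cache.lookup(first) is not None)   # saved credentials are reused
print(cache.lookup(second) is None)      # a different session prompts again
```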
After a user clicks OK to close the prompt, the startup script file is run. The script file retrieves all the
query values, link values, and any values modified during logon and uses this information to drive the
3270 interface to connect to the APPLID returned in the query and to navigate to the appropriate display
in OMEGAMON for MVS.
The connection values for the OMEGAMON for MVS session (host name, port number, Logical Unit (LU)
Group and APPLID) are discovered by the monitoring agent. However, these values can be overridden
using PARMGEN. For more information about configuring OMEGAMON for MVS parameters, see Overview
of configuration parameters. The host to which the TN3270 session connects must have an active
TN3270 listener. By default, that host is assumed to be the LPAR on which the monitoring agent is
located. If there is no active listener, the address of an LPAR that does have an active listener must be
specified. The default port number for the Telnet listener is 23. This value can also be overridden. The
Dynamic XE to 3270 (Classic) linking feature requires the VTAM® Unformatted System Services (USS) screen to accept
a LOGON APPLID() DATA() command. If the default Telnet USS screen does not
accept this command, the name of a Logical Unit (LU) group that does accept it must be provided. The
TN3270 session will be joined to that LU group. The default values are overridden on the Add Runtime
Environment (3 of 3) or Update Runtime Environment (3 of 3) panel.
The default values or the override values used for the session are displayed at the TN3270 logon and can
be modified for an individual TN3270 session.
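The precedence described above (discovered values, then configured overrides, with port 23 as the default) can be sketched as a simple merge. This Python sketch is illustrative only; the function name and the dictionary keys host, port, lu_group, and applid are hypothetical and do not correspond to PARMGEN parameter names.

```python
DEFAULT_TELNET_PORT = 23  # default Telnet listener port named in the text


def resolve_connection(discovered, overrides=None):
    """Merge discovered connection values with optional overrides.

    Both arguments are plain dicts that use the hypothetical keys
    'host', 'port', 'lu_group', and 'applid'.
    """
    values = {"port": DEFAULT_TELNET_PORT}
    values.update(discovered)
    # An override replaces a discovered value only when it is actually set.
    for name, value in (overrides or {}).items():
        if value is not None:
            values[name] = value
    return values


print(resolve_connection({"host": "sysa", "applid": "APPLA"}))
print(resolve_connection({"host": "sysa", "applid": "APPLA"},
                         {"host": "sysb", "port": 1023}))
```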
The terminal connection terminates after 5 minutes of inactivity or when you end this Tivoli Enterprise
Portal work session.
Product-provided links
OMEGAMON for z/OS provides seven launch points to four target workspaces. These links save you
several time-consuming steps and the cutting and pasting of data required to navigate to the OMEGAMON
for MVS screens that you require.
For example, suppose an alert is raised by OMEGAMON for z/OS in the Tivoli Enterprise Portal indicating
there is a common storage shortage of CSA. Instead of logging on to an OMEGAMON for MVS TN3270
session and drilling down to the appropriate information, you can review the relevant information in
the OMEGAMON for z/OS workspaces and then log on to OMEGAMON for MVS via TN3270 to display
orphaned CSA storage and release it, if appropriate, freeing up this valuable resource and avoiding CSA
shortage problems that might result in an IPL.
OMEGAMON for MVS screens that require level 3 authority or that are high CPU users were avoided as
targets of these links. Like the product-provided situations, these links are meant to be useful examples
that can be built on (see “Creating new dynamic links” on page 36).
Table 4. Product-provided dynamic links
From                                          To
Address Space Common Storage – Orphaned       OMEGAMON for MVS – CSA Analyzer workspace
Elements workspace
LPAR Clusters workspace                       OMEGAMON for MVS – LPAR PR/SM Processor
                                              Statistics workspace
System CPU Utilization workspace              OMEGAMON for MVS – License Manager MSU and
                                              WLM Capping
Address Space Overview workspace              OMEGAMON for MVS – Job Details workspace
Address Space Bottlenecks Summary workspace
Address Space CPU Utilization workspace
Address Space Storage workspace
Creating new dynamic links
The process of creating a dynamic OMEGAMON to 3270 link can be summarized as follows:
• A target workspace is created with one or more terminal adapter (3270 emulator) views.
• A script is created and associated with the terminal emulator view.
• A query is associated with the terminal adapter to fetch connection values from the monitoring agent.
• The link is created from the source report to the target workspace that includes the values from the
source report that will be needed for the script invoked by the terminal adapter.
Instructions for creating new links by modifying product-provided workspaces and scripts:
1. “Create the target workspace” on page 37.
2. “Modify the associated script” on page 38.
3. “Define the dynamic link” on page 40.
If you want to create new workspaces and scripts, follow the instructions in the Tivoli Enterprise Portal
online help or the User's Guide in the IBM Tivoli Monitoring documentation. Any links you create should
use the Dynamic Link to 3270 query (see “Assigning the Dynamic Link to 3270 query” on page 43).
Tip: You must have Workspace Author Mode permission to create the target workspaces and the dynamic
links. If you want to share the links and workspaces you create with other users, you must have
Workspace Administration Mode permission.
Create the target workspace
Complete the following steps to create a target workspace based on the OMEGAMON for MVS - Job
Details workspace:
1. In the Navigator, open the Address Space Overview workspace.
2. In the Address Space CPU Utilization Summary table, right-click the link in the first row and select
the OMEGAMON for MVS - Job Details link from the popup menu.
The OMEGAMON for MVS - Job Details workspace opens, and a prompt asks for logon credentials for
the TN3270 session.
3. Cancel the prompt, then select Save as from the File menu.
4. In the Save Workspace As window, provide a name and description for the workspace and check Only
selectable as the target of a Workspace Link, then click OK.
You have now created the target workspace. The terminal view in this workspace is already associated
with the Dynamic Link to 3270 query, so you do not need to assign it to the terminal view. However, you must
modify the script associated with the terminal view (see “Modify the associated script” on page 38).
Leave the properties window open; otherwise you will have difficulty locating the new target workspace.
Modify the associated script
In this step you modify the script already associated with the new target workspace.
You can record or write your own scripts, using the instructions in the Tivoli Enterprise Portal online help
or the User's Guide in the IBM Tivoli Monitoring documentation, but you will probably want to use the
connection and sign-on code from an existing script.
Scripts can be global or local. You can only access local scripts through the Properties window of the
terminal view with which they are associated.
The scripts associated with terminal views that are the target of dynamic links have essentially two parts:
the first part handles the connection and logon, and the second part handles the navigation to a specific
OMEGAMON for MVS screen. The second part uses the symbol names assigned to the attributes
to fetch the values it uses to navigate to the relevant screen. This is probably the only part of the script
that you will need to modify.
To modify the script, take the following steps:
1. In the Properties window of the terminal view, select the Configuration tab (if necessary), then select
the Scripts tab at the bottom of the Configuration pane.
2. In the Scripts for IBM 3270 (24x80) list, select OMEGAMON for MVS, then select Open.
The script is displayed in an editing window.
3. Look for the comment:
// We are at the OMEGAMON for MVS Main Menu
Edit the script after this point as necessary to reflect the navigation path and symbol names that the
script will need to retrieve from the link.
4. Select Save As from the File menu, name the modified script, specify a time-out value, and then
click OK.
5. Click OK to close the Properties editor.
Now you are ready to define the dynamic link.
Define the dynamic link
In this step you create a link to the target workspace. When you define the link, you identify the target
workspace and specify the attributes whose values are passed to the script to use to navigate to the
relevant OMEGAMON for MVS screen.
Follow these steps to create a link between a source OMEGAMON for z/OS workspace and a related
OMEGAMON for MVS screen:
1. Open the workspace that you want to link from.
2. Right-click in a row in the table view from which you want to create the link and select Link to > Link
Wizard (if there are no existing links in the row) or Link Wizard (if there are existing links) from the
popup menu.
3. On the Link Wizard Welcome window, select Create a new link and click Next.
4. On the Link Name window, type a Name and Description to identify the link and click Next.
5. On the Link Type window, select Dynamic as the link type and click Next.
6. In the Navigator pane of the Target Workspace window, expand the branches and select the
Address Space Overview node. ("Cloned" workspaces are listed under the node associated with the
workspace they are cloned from; in this case, the OMEGAMON for MVS - Job Details workspace,
which is associated with the Address Space Overview node.)
7. In the Workspace pane, select the workspace to target and click Next.
8. In the Target Filters window, select Managed System Name and click Modify Expression to open the
Expression Editor.
9. In the Expression Editor, click Symbol to open the Symbols list.
10. Expand the Agent node and select Node.
11. Click OK to close the Symbols list and OK again to close the Expression editor.
$AGENT:NODE$ appears as the expression for Managed System Name.
12. Click Next to open the Parameters window.
13. In the Parameters window, click Add Symbol and provide a symbol name for the first attribute whose
value you want to pass to the navigation script, then click OK to add the symbol to the Parameter
panel.
14. Select the symbol you just added, then click Modify Expression.
15. In the Expression Editor, click Symbol and select the name of the attribute whose value you want to
associate with the symbol you just named.
The list is a reverse hierarchy of available symbols starting from the source context and the current
Navigator item and ending at the root Navigator item.
16. Click OK to close the Symbol window, then click OK again to close the Expression editor.
The attribute name is added in the expression window.
17. Repeat steps 13–16 until you have added all the attributes whose values you want passed to the
script, then click Next.
The Summary window is displayed.
18. Review the description of the link that is going to be created. If you want to change something about
the link, click Back. If the description is correct, click Finish.
The Wizard closes, and the link is added to the view.
Assigning the Dynamic Link to 3270 query
The values returned by the query associated with the terminal view provide the information needed to
connect to the correct host and log on to the appropriate OMEGAMON for MVS session. The values passed
on the link are used to navigate to a specific OMEGAMON for MVS screen. All terminal views that are the
target of OMEGAMON for z/OS dynamic links should use the Dynamic Link to 3270 query.
Note: If you based the target workspace on an existing workspace that is the target of an OMEGAMON for
z/OS dynamic link, you do not need to assign the query. It is already assigned.
Take the following steps to assign the Dynamic Link to 3270 query to the terminal view:
1. Right-click in the terminal view and select Properties.
The Properties editor opens and the Configuration tab for the view is displayed.
2. Select the Query tab, then click Click here to assign a query.
The Query editor opens.
3. In the left-hand Queries navigation frame, expand the MVS System entry, and then the DWL to
3270 entry and select Dynamic Link to 3270.
4. Click OK to assign the query and return to the Properties editor.
Chapter 4. Monitoring scenarios
Use these scenarios to learn how you can use the workspaces, attributes, and situations provided with
OMEGAMON for z/OS to monitor your z/OS systems and Sysplexes.
“Monitoring shared DASD” on page 45 provides examples of how you can use situations to monitor
the performance of shared DASD in your Sysplex. It includes instructions for creating situations for
filtering the collection of data for DASD devices to reduce processing overhead in environments with large
numbers of devices.
“Monitoring virtual storage and missing jobs” on page 50 contains two scenarios. The first illustrates
how you can use OMEGAMON for z/OS to monitor storage usage. The second shows you how to create a
monitoring situation to alert you to the failure of critical tasks.
“Monitoring service class goals” on page 54 shows how to monitor service class goals by creating a
situation that notifies the appropriate parties when a service class is missing its goals. It also shows you
how to create a threshold in a workspace table view that matches the situation parameters to help pinpoint
the problem service classes when you are doing problem analysis.
“Monitoring cryptographic coprocessors” on page 64 contains three scenarios that illustrate how you
can use the data collected and presented by OMEGAMON for z/OS to monitor and improve your
cryptographic services.
“Detecting CPU looping address spaces” on page 68 describes the CPU Loop Index metric and how it
can be used to detect CPU loops.
Monitoring shared DASD
Because of the large DASD volume counts that have become common in recent years, monitoring DASD
devices without a filter that eliminates some of the devices can lead to high CPU usage or storage problems and
may even cause the monitoring server to fail. Because of these potential costs, although OMEGAMON for
z/OS provides the capability to monitor Sysplex shared DASD, it does not collect DASD device data unless
a DASD filter situation is active. An autostarted warning situation (KM5_No_Sysplex_DASD_Filter_Warn)
notifies you if no filtering situation is in place and no devices are being monitored.
You can turn DASD data collection on by running a DASD filter situation. OMEGAMON for
z/OS includes a model filter situation (KM5_Model_Sysplex_DASD_Filter), which uses the DASD
Device Collection Filtering attributes Average Response Time and I/O Rate. By customizing this
filter to exclude well behaved devices, you can enable monitoring of devices of particular
interest and avoid being overwhelmed with unwanted data. A third product-provided situation,
KM5_Weak_Plex_DASD_Filter_Warn, alerts you when too many devices are being monitored and filter
criteria should be strengthened.
Instructions are provided for creating situations for filtering the collection of data for DASD devices to
reduce processing overhead in environments with large numbers of devices. You can use Tivoli Enterprise
Portal and OMEGAMON for z/OS to monitor and manage the performance of shared DASD in your Sysplex.
The section on identifying causes of I/O delays should be of interest to all users. The section on DASD
device collection filtering should be of interest to those with administrative authority who are responsible
for configuring data collection.
Filtering collection of DASD device data
With OMEGAMON for z/OS, the best way to reduce processing overhead is to control the amount of DASD
information being sent to the Sysplex proxy for sort merge processing. Six thousand unit addresses on
each of nine LPARs in a Sysplex, for example, requires the proxy to sort merge a considerable amount of
data before the data can be evaluated. However, by creating a DASD filter situation to reduce the number
of rows of data sent to the proxy and limiting data collection to DASD devices that are performing poorly
or experiencing contention, you can dramatically reduce overhead.
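The scale of the example above can be worked out with a few lines of arithmetic. The following Python sketch is illustrative only: it assumes one row of data per device per LPAR, and the figure of 500 filtered devices is an example target chosen for this sketch, not a product default.

```python
def rows_for_proxy(devices: int, lpars: int) -> int:
    """Rows the Sysplex proxy must sort merge, assuming one row per device per LPAR."""
    return devices * lpars


# Figures from the example in the text: 6000 unit addresses on each of 9 LPARs.
unfiltered_rows = rows_for_proxy(6000, 9)

# Assumption for illustration: the filter situation selects 500 devices,
# and each selected device is reported by all 9 LPARs.
filtered_rows = rows_for_proxy(500, 9)

# Fraction of rows that the proxy no longer has to sort merge.
reduction = 1 - filtered_rows / unfiltered_rows

print(f"Unfiltered rows: {unfiltered_rows}")
print(f"Filtered rows:   {filtered_rows}")
print(f"Reduction:       {reduction:.1%}")
```

Under these assumptions the filter removes more than nine tenths of the rows before they reach the proxy.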
In most cases, you create a situation to alert you to problems in the monitored system. When you create
a situation for DASD device collection filtering, you are filtering the devices that are being monitored and
identifying the devices that need further monitoring. You use the attributes in the DASD Device Collection
Filtering attribute group to create the filter situations.
When you create a situation for DASD device collection filtering, OMEGAMON for z/OS builds a list of
DASD devices based on the situation. This list is rebuilt on the DASD filter situation interval. The lower the
interval, the more overhead is incurred as Resource Measurement Facility (RMF) data is collected on all
the devices to determine if they qualify. The higher the interval, the more likely that spikes in activity on
previously inactive volumes will go unnoticed, as they were not in the monitored volume list.
If a DASD device meets the requirements in the situation, all data for the device is forwarded to the
Sysplex proxy for sort merge. If the device does not meet the requirements in the situation, the device
data is not forwarded to the Sysplex proxy. Every monitored LPAR is checked to see if the device meets
the filter criteria on that LPAR. If it does meet the criteria, the device is included for monitoring on every
LPAR so its activity can be combined over the whole Sysplex.
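The selection rule described above (a device that qualifies on any one LPAR is monitored on every LPAR) can be sketched as follows. This is a simplified model, not the product's implementation: the data layout, the function names, and the threshold values are all assumptions made for illustration.

```python
from typing import Dict, Set

# Hypothetical layout: {lpar_name: {volser: {"avg_response": x, "io_rate": y}}}
Samples = Dict[str, Dict[str, Dict[str, float]]]


def meets_filter(sample: Dict[str, float],
                 min_response: float = 70.0,
                 min_io_rate: float = 2.0) -> bool:
    """Illustrative filter: both values must exceed their thresholds."""
    return sample["avg_response"] > min_response and sample["io_rate"] > min_io_rate


def monitored_devices(samples: Samples) -> Set[str]:
    """Return the devices that meet the filter on at least one LPAR.

    Every device in the returned set would then be monitored on every LPAR,
    so that its activity can be combined over the whole Sysplex.
    """
    selected: Set[str] = set()
    for devices in samples.values():
        for volser, sample in devices.items():
            if meets_filter(sample):
                selected.add(volser)
    return selected


samples: Samples = {
    "LPAR1": {
        "DB2001": {"avg_response": 85.0, "io_rate": 3.5},
        "WRK001": {"avg_response": 12.0, "io_rate": 40.0},
    },
    "LPAR2": {
        "DB2001": {"avg_response": 20.0, "io_rate": 1.0},
        "WRK001": {"avg_response": 15.0, "io_rate": 35.0},
    },
}

selected = monitored_devices(samples)
print(sorted(selected))
```

In this sample data, DB2001 qualifies only on LPAR1, but it is still selected for monitoring across the Sysplex; WRK001 is busy but responds quickly everywhere, so it is filtered out.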
After you have enabled the situations, the number of devices exceeding the situation thresholds should
be no more than 500 or whatever number seems viable for your site.
Requirements and restrictions for filter situations
Situations for DASD collection filtering are subject to several requirements and restrictions.
• You must create the situation using the attributes in the DASD Device Collection Filtering group.
• For each Sysplex, you can have only one situation for DASD collection filtering. (For this reason, the one
situation must contain all the conditions for monitoring the devices.)
• You can create one situation for DASD collection filtering and distribute the same situation to more than
one Sysplex.
• The collection interval can be as small as 5 minutes, but should probably be somewhere between 15
and 30 minutes. You may want to use the same interval as you use for Resource Measurement Facility
(RMF).
Overhead for the situation is dependent on both DASD farm size and refresh interval. Big farms
or short intervals increase overhead. In establishing the collection interval, weigh the probability of
missing an important performance event. If you feel that the usage pattern of your DASD farm changes
dramatically every 10 minutes, for example, make the interval 10 minutes.
Creating a filtering situation
OMEGAMON for z/OS provides a model filter situation, KM5_Model_Sysplex_DASD_Filter, which uses the
DASD Device Collection Filtering attributes Average Response Time and I/O Rate. You can quickly create
an effective filter situation by customizing this situation to suit your site requirements.
Complete the following steps to create a filter situation based on KM5_Model_Sysplex_DASD_Filter:
1. In the Tivoli Enterprise Portal Navigator, open the Situation editor.
2. Expand the MVS Sysplex node in the navigation tree.
3. Select the KM5_Model_Sysplex_DASD_Filter situation and click the Create Another Situation
button.
The Create Situation dialog box is displayed.
Figure 2. Create Situations dialog
4. Specify a name and description for the situation and click OK.
The Formula dialog for the new situation is displayed.
5. Set the threshold values for the Average Response Time and I/O Rate attributes by clicking in the row
beneath each attribute name and typing in the values.
Figure 3. The Formula tab for a filter situation
Note: Filter criteria can be specified for other columns, but these two are the only attributes that will
have a noticeable effect on refresh time and CPU consumption.
6. Use the other options on this tab to customize the situation: add additional attributes to further refine
the filter criteria, change the monitoring interval, and so on.
7. Click Apply to save the new situation.
After you have customized the situation, you must distribute and start it before filtering can take effect.
Distributing the situation
After you have created a situation for DASD device collection filtering, you must distribute the situation to
the systems.
Take the following steps to distribute the situation:
1. With the situation selected in the Situation editor, click the Distribution tab.
2. Make one of the following assignments:
• To distribute the situation to a single Sysplex, in the Available Managed Systems box, click the
name of the Sysplex to which you want to assign the situation and click the left arrow.
Tivoli Enterprise Portal moves the name of the Sysplex to the Assigned box.
• To distribute the situation to all Sysplexes, select *MVS_SYSPLEX in the Available Managed
Systems List box, and click the left arrow to add the managed system list to the Assigned box.
3. Click OK to distribute the situation.
Starting and stopping the situation
You can start and stop the situation using the Situations dialog box.
1. In the left-hand Situations frame in the Situation editor, right-click the name of the situation.
2. Select the appropriate option from the pop-up menu:
• To start the situation, click Start.
• To stop the situation, click Stop.
If you want the situation to run continuously across Tivoli Enterprise Portal restarts, check Run at startup
on the Formula tab.
Displaying messages for the situations you create
OMEGAMON for z/OS provides messages for DASD device collection filtering. You can display these
messages in the RKLVLOG on the Tivoli Enterprise Monitoring Server on the host. The messages for
Sysplex situations have the prefix KOS.
Example situations
The following are examples of situations that filter DASD device collection.
Filtering for devices that are busy
The following situation limits data collection to devices that have:
• Some activity
• Long average response times
Average response time GT 70.0 AND
I/O Rate GT 2.0
Filtering for performance problems on critical volumes
The following situation limits the data collection to critical database devices that have:
• A volume serial name that begins with DB2 or contains IMS
• Some activity
• Slow response times
(Average response time GT 70.0 AND I/O Rate GT 2.0 AND Volume Serial Number EQ DB2*) OR
(Average response time GT 70.0 AND I/O Rate GT 2.0 AND Volume Serial Number EQ *IMS*)
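The logic of the two example situations above can be expressed as a small predicate. The following Python sketch only illustrates the conditions as the bullets describe them: the function and parameter names are hypothetical, and `fnmatchcase` stands in for the wildcard matching implied by `DB2*` and `*IMS*`. It is not how the product evaluates situations.

```python
from fnmatch import fnmatchcase


def busy_device_filter(avg_response: float, io_rate: float) -> bool:
    """First example: some activity and a long average response time."""
    return avg_response > 70.0 and io_rate > 2.0


def critical_volume_filter(volser: str, avg_response: float, io_rate: float) -> bool:
    """Second example: the same thresholds, limited to critical volumes.

    A volume is treated as critical if its serial begins with DB2
    or contains IMS anywhere in the name.
    """
    is_critical = fnmatchcase(volser, "DB2*") or fnmatchcase(volser, "*IMS*")
    return is_critical and busy_device_filter(avg_response, io_rate)


# Hypothetical sample rows: (volume serial, average response time, I/O rate)
rows = [
    ("DB2P01", 90.0, 5.0),
    ("PRIMS1", 75.0, 2.5),
    ("WORK01", 120.0, 9.0),
    ("DB2P02", 50.0, 5.0),
]

kept = [volser for volser, resp, rate in rows
        if critical_volume_filter(volser, resp, rate)]
print(kept)
```

Because the response time and I/O rate conditions are identical in both halves of the second formula, the sketch factors them out; the result is logically the same as the two parenthesized expressions joined by OR.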
Using shared DASD data collection to identify the cause of I/O delays
OMEGAMON for z/OS provides three Sysplex-level DASD related workspaces:
• The Shared DASD Groups Data for Sysplex workspace displays information on device contention and
usage for all the groups in a Sysplex. This information can help determine how equitably a device is
serving all systems in the Sysplex.
• The Shared DASD Devices workspace displays statistics for the individual devices in a selected group.
It shows the activity of the shared devices for the group, averaged over all systems in the Sysplex.
This information can help determine how equitably a device is serving all systems in the Sysplex.
• The Shared DASD Systems workspace displays information about the systems that share a device. This
information helps you measure the performance and exceptions from the perspective of each system.
The following scenario illustrates how you can use these workspaces, in conjunction with a monitoring
situation designed to alert you to device contention, to identify the devices and data sets responsible for
significant I/O time or delays for important workloads in your Sysplexes.
Getting situation event notification
You are looking at the Tivoli Enterprise Portal and see that the z/OS Systems Navigator icon is overlaid by
a Warning event indicator.
You move your cursor over the icon to see a flyover list of situations that are currently true for your
mainframe systems. The Sysplex_DASD_Dev_ContIdx_Warn situation, which you activated to monitor
DASD device contention in your Sysplexes, is listed in the flyover.
Glancing down the expanded z/OS Systems tree, you notice warning indicators for the Service Classes
Data for Sysplex and the Shared DASD Groups Data for Sysplex items in the SYSPLEX1 tree, which leads
you to suspect the problem is in the distribution of I/O activity among your DASD volumes.
Analyzing I/O distribution
The Sysplex_DASD_Dev_ContIdx_Warn situation has alerted you that the DASD device contention
index has reached a level that warrants attention. Now you want to find information about the device or
devices causing the contention.
In the Navigator, you select Shared DASD Groups Data for Sysplex and the default workspace is
displayed. To determine if I/O activity is unevenly distributed among the devices, you examine the
statistics in the following list:
• The average true busy percentage of the group. If, for example, the value ranges between 1.1 at the
lower end and 60.2 at its highest, that means that on average, devices in the group spend 1.1% of their
time doing work for all systems; at its busiest, one device is spending 60.2% of its time doing work for
all systems.
• The average device contention for the group. For example, the value of the average contention index
might be 0.75, and the highest device contention index of the group 1.5. A high number (more than
1.0) means that I/O requests for a device are substantially delayed because of contention for the
device or path generated by other systems. You would like to lower the contention index for a device in
contention.
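The two checks in the list above can be sketched in a few lines of Python. The group names, field names, and the second group's values are hypothetical; the first group uses the example figures from the text, and the 1.0 limit reflects the statement that a contention index above 1.0 is high.

```python
from dataclasses import dataclass


@dataclass
class GroupStats:
    """Hypothetical summary of one shared DASD group."""
    name: str
    avg_busy_pct: float      # average true busy percentage of the group
    max_busy_pct: float      # true busy percentage of the busiest device
    avg_contention: float    # average device contention index
    max_contention: float    # highest device contention index in the group


def busy_spread(group: GroupStats) -> float:
    """Gap between the busiest device and the group average."""
    return group.max_busy_pct - group.avg_busy_pct


def has_high_contention(group: GroupStats, limit: float = 1.0) -> bool:
    """True if at least one device in the group exceeds the contention limit."""
    return group.max_contention > limit


groups = [
    GroupStats("GROUPA", 1.1, 60.2, 0.75, 1.5),   # figures from the text
    GroupStats("GROUPB", 4.0, 9.5, 0.20, 0.4),    # hypothetical quiet group
]

flagged = [g.name for g in groups if has_high_contention(g)]
print(flagged)
```

A large busy spread combined with a contention index above the limit suggests that I/O activity in that group is unevenly distributed and worth examining device by device.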
Examining this information allows you to determine which group or groups of devices are experiencing
uneven I/O distribution, but you need information about the specific devices involved.
Isolating the problem
To get information about the devices in a group you have identified, you click the link icon by the link
name in the Shared DASD Groups table view to link to the Shared DASD Devices workspace for that group.
The Shared DASD Devices workspace displays statistics for each device in the group. With the information
in this workspace, you can determine if a device is not serving all systems in the Sysplex equitably. You
determine this by examining:
• True percent busy. If this value is too high for one device, while other devices have a lower value, the
workload should be balanced between the devices.
• Contention index. If this value is too high, this is an indication that I/Os for the device are substantially
delayed due to contention for the device or path generated by other systems.
• System response time. A high value in this column indicates an inordinate amount of time is required for
this device to process I/O activity.
• Cumulated I/O rate. This value indicates whether the device is processing I/Os at an acceptable rate.
To view additional information for a specific device to help you determine if it is causing I/O delays, you
click the link button next to the volume serial number in the Shared DASD Devices table. The link
takes you to the Shared DASD Systems workspace, which shows:
• The systems that share the device, indicating which systems have a high response time and a high I/O
rate.
• Performance measures and exceptions for each system, indicating how the device is performing for
each system.
Based on this information, you can identify which devices are causing excessive I/O delays.
Taking action to resolve a shared DASD problem
From the information you have gathered, you decide to implement one of the following solutions.
• Redistribute the work among devices to eliminate the contention for resources.
• Reschedule the work for a single device to a time when device contention is usually low.
Monitoring virtual storage and missing jobs
This section contains two scenarios. “Monitoring paging and virtual storage” on page 50 illustrates how
you can use OMEGAMON for z/OS to monitor storage usage; “Monitoring critical started tasks” on page 52
illustrates how to create a monitoring situation to alert you when critical tasks fail.
Monitoring paging and virtual storage
The maximum amount of system virtual storage is bounded by the amount of real storage plus paging
space limits, so paging space performance and availability are vital factors in the effective execution
of applications. With the introduction of 64-bit or large storage objects, the memory requirements of
paging became higher and storage can become exhausted more readily. When page space usage approaches 30%,
paging efficiency begins to decline, and blocked paging disappears at about 35% occupancy. Severe
problems can occur if page space used is greater than 85%. Because the percentage of paging space in use
increases along with the paging rate, it is a good indicator that a problem may be developing.
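The thresholds mentioned above can be summarized as a simple classification. The following Python sketch is only a reading aid: the function name and status labels are invented for illustration, and the text describes the 30% and 35% points as approximate, so the exact boundaries used here are assumptions.

```python
def page_space_status(pct_used: float) -> str:
    """Classify page space usage using the approximate thresholds from the text."""
    if pct_used > 85.0:
        return "severe problems possible"
    if pct_used >= 35.0:
        return "blocked paging lost"
    if pct_used >= 30.0:
        return "paging efficiency declining"
    return "normal"


# Hypothetical usage percentages for a few local page data sets
for pct in (12.0, 31.5, 40.0, 90.0):
    print(f"{pct:5.1f}% used: {page_space_status(pct)}")
```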
The following scenario suggests how you can use the resources provided by OMEGAMON for z/OS to
detect and analyze paging and storage problems.
Activating the OS390_Local_PageDS_PctFull_Crit situation
To alert you to impending problems and to simplify analysis when problems occur, activate the
OS390_Local_PageDS_PctFull_Crit predefined situation. This situation monitors to determine whether
the percentage of slots in use on a local page data set is greater than or equal to 35% and issues a
Critical alert if this condition is found to be true. (See “Activating situations” on page 13 for instructions
for activating the situation.)
You can also modify this situation to include page rate as an additional indicator that the paging system
performance is becoming impacted.
Using the Page Dataset Activity workspace to gather information
If you see an event indicator alerting you that the OS390_Local_PageDS_PctFull_Crit condition is true,
you can start investigating the problem using the Page Dataset Activity workspace for the affected
system.
This workspace provides information about availability and response time for a specific page data set.
Page data sets are auxiliary storage data sets that back up all frames of virtual storage. They must be
large enough to contain all common and private virtual storage. Page data sets are used when an address
space references data that is not in either real or expanded storage. The process of bringing in data is
called a page-in and is coordinated by the auxiliary storage manager (ASM). If swap data sets are not
defined, page data sets also contain the swapped out part of an address space.
Because the process of paging is very slow when compared to referencing data from real or expanded
storage, it is important that page data set devices be isolated from contention with other kinds of work.
This is especially true if there is contention for real and expanded storage, and the page fault rate is high.
The Percent Full and Response Time bar charts in this workspace provide visual representations of the
availability of space in the various types of page data sets and the response times for those data sets.
Even if page rates are low, data sets over 35% full can indicate a performance issue is developing
and some action may be required.
Your next step is to decide what action to take to resolve any problems. Rather than simply adding
another data set, you can use the Address Space Storage workspace to evaluate if there are any jobs
which can be trimmed or moved to a different system or time slot to balance out system resources.
Examining storage usage
You can link to the Address Space Storage workspace from the Address Space Counts table of the
Address Space Overview workspace.
In the workspace, the Fixed Storage bar graph shows who the heavy users of fixed storage are. The
Virtual Memory table provides data on fixed and virtual low, extended, and large storage use. This
information can be used as an application tuning tool as well as a system performance tuning tool.
Check for applications using large percentages of the storage memory limits. This information can be used
in deciding how to manage those applications later.
Evaluating your options
Consider the following options to address any paging problems you have detected:
• Increase paging to align with potential virtual storage demand
This would mean increasing the size of paging space by adding more local page data sets or increasing
the size of existing ones.
• Redistribute larger applications to other LPARs
You can redistribute large applications or applications using a lot of storage to other systems where the
workload is lighter or paging demand lower. Alternately or in addition, you could rebalance large jobs at
different times so that they do not run concurrently and compete for virtual storage or paging space.
• Decrease the application's memlimit to reduce storage demand on the system
The viability of this option depends on your application service requirements, as such a change could
directly impact performance in the application. This could also depend on the behavioral effects on the
application, as some applications may not be able to effectively function with lowered memory limits.
Monitoring critical started tasks
In most environments, there is a set of started tasks, such as CICS® tasks or WebSphere® tasks, that
should always be running. This scenario shows you how to define a situation that will alert you if one
or more of these tasks fail. The situation is based on the Job Name attribute of the Address Space CPU
Utilization attribute group and uses the Check for Missing Items function.
Creating the MissingTaskAlert situation
Note: This scenario assumes that you are already familiar with the basic steps for creating a situation. If
you are not sure of the steps, see the Tivoli Enterprise Portal online help or the Tivoli Enterprise Portal
User's Guide in the IBM Tivoli Monitoring documentation.
In this scenario, you create a situation to monitor two tasks, CICS1 and WEB5, running on SYSTEM1 in
SYSPLEX1.
1. Click the Situation Editor icon at the top of the screen, and then click MVS System.
2. In the Situation editor, create a new situation.
3. In the Create Situation dialog box, type a name and description for the situation, for example,
MissingTaskAlert.
4. In the Select Attribute dialog box, select the Address Space Real Storage attribute group and
the Address Space Name attribute, and then click OK.
5. In the expression editor, set the cursor in row 1 under Address Space Name; click the function icon,
and then click Check for Missing Items.
6. In the Missing Item popup, enter your list of critical tasks, and then click OK.
7. If you want to make the situation trigger independently for each job name:
a. Click Advanced.
The Advanced Situation Options dialog box appears.
b. Click the Display Item tab, and then select Address Space Name as the display item.
This is especially helpful if you want to attach an action to the situation, such as a start command.
c. Click OK.
8. Complete the situation, for example by selecting a different sampling interval, adding any advice or
instructions you want to provide, adding a Start command to restart the task, or distributing it to other
systems you want it to run on.
Remember to stop the situation if you migrate the task that it is monitoring.
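Conceptually, a missing-items check compares a list of expected names against the names that are actually present. The following Python sketch illustrates that idea with the two tasks from this scenario; the function name and the list of running address spaces are hypothetical, and the sketch does not reproduce how the Check for Missing Items function is implemented.

```python
from typing import Iterable, List


def missing_items(expected: Iterable[str], running: Iterable[str]) -> List[str]:
    """Return the expected names that do not appear among the running names."""
    running_set = set(running)
    return [name for name in expected if name not in running_set]


critical_tasks = ["CICS1", "WEB5"]            # tasks named in the scenario
running_now = ["CICS1", "JES2", "TCPIP"]      # hypothetical address space names

missing = missing_items(critical_tasks, running_now)

# With the address space name as the display item, each missing task
# would be reported separately rather than as one combined alert.
for task in missing:
    print(f"Missing critical task: {task}")
```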
Monitoring service class goals
This scenario illustrates how you can use OMEGAMON for z/OS resources to monitor service class goals.
The scenario also provides step-by-step instructions for creating a situation, defining a reflex action using
a TSO command, and setting a threshold in a table view.
The scenario
Your DB2 and IMS transaction servers service applications that are important to your business operation.
You run these address spaces on your SYSTEM1 z/OS system under the STCONLN Service Class.
You want to use OMEGAMON for z/OS to monitor this service class and notify the appropriate parties if the
service class is missing its goals. You also want to use OMEGAMON for z/OS to determine why the service
class is missing its goals if that happens.
To set up monitoring, alerting, and analysis, you perform the following tasks using the Tivoli Enterprise
Portal:
• Create a situation that raises a Critical alert when a service class is not meeting its goals.
• Define an action for the situation that causes a system command to be executed on the z/OS system
associated with the raised situation.
• Modify the WLM Service Class Resources workspace to provide a column threshold that matches the situation
parameters. This will help pinpoint the problem service classes when doing problem analysis.
Creating the zOS_Critical_SvcClass_Missed_Goal situation
Your first step is to create a situation that raises an event and notifies the appropriate personnel when a
service class misses its goal.
You create a new situation named zOS_Critical_SvcClass_Missed_Goal using the Situation editor. This
situation raises a Critical event indicator and sends a TSO message to a designated system
administrator when the Performance Index for a service class period is greater than 1.5. You set the
situation to start whenever the Tivoli Enterprise Portal Server starts (autostart).
Because you access the Situation editor from the WLM Service Class Resources Navigator item for
SYSTEM1, this situation is automatically associated with that item and a Critical event indicator will
appear on the item when the situation is true.
To create the zOS_Critical_SvcClass_Missed_Goal situation, complete the following steps:
1. Navigate to system SYSTEM1 in the Navigator and expand the item, if necessary.
2. Right-click the WLM Service Class Resources item for SYSTEM1 and select Situations from the
popup menu.
The Situation editor opens. The WLM Service Class Resources node and any associated situations are
displayed in the left-hand frame.
3. Click the Create New Situation icon.
The Create Situation dialog box appears.
4. Provide a name and description for the situation, and then click OK.
The Select attribute dialog box appears, with the WLM_Service_Class_Resources group selected.
5. Scroll down the Attribute Item list and select the Performance Index attribute, and then click OK.
The Formula tab for your new situation is displayed with the Performance Index attribute added to
the expression editor.
6. Create the situation expression:
a. Click in the first row of the Performance Index column.
b. Click the relational operator button and select > Greater Than from the popup menu.
c. Type a value of 1.5.
7. Accept the default sampling interval of every 15 minutes, or set the sampling interval that suits your
monitoring requirements.
8. By default, Run at startup is selected. This means that after the situation has been distributed, it
will start whenever the Tivoli Enterprise Monitoring Server is started. If you want to start and stop the
situation manually, for example to test the effects of the selected sampling interval, deselect Run at
startup.
9. Click Apply to save the situation properties you have defined so far.
10. To generate a separate action message for each service class name, make Service Class a display
item:
a. Click the Advanced... button.
The Advanced Situation Options dialog box appears.
b. Select the Display Item tab, and then use the dropdown menu for the Item field and select
Service_Class.
c. Click OK to close the Advanced Situation Options dialog box.
Defining the Take Action command
You want to use a TSO command to send a message to the system administrator whenever a service class
misses its goal. You want to include in the message the name of the service class and the value of the
Performance Index attribute at the time the situation became true. You accomplish this by adding a Take
Action command to the zOS_Critical_SvcClass_Missed_Goal situation.
Perform the following steps to define the Take Action command:
1. With the zOS_Critical_SvcClass_Missed_Goal Properties window open, select the Action tab.
2. In the System Command text field, type the following command, using attribute substitution for
service_class and performance_index value.
SEND 'service class service_class_name is missing goal. Performance Index is performance_index',USER=(userid)
3. If you want separate notification for every monitored item for which the situation is true, select Take
action on each item.
4. Set the command to run at the Tivoli Enterprise Monitoring Server.
5. Click Apply to save the command.
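To make the effect of attribute substitution concrete, the following Python sketch builds the command text that results once the two placeholders are filled in. It is an illustration only: in the product, the Situation editor performs the substitution. The function name build_send_command, the user ID SYSADM1, and the sample values are hypothetical.

```python
def build_send_command(service_class: str, performance_index: float, userid: str) -> str:
    """Return the TSO SEND command text after the placeholders are filled in."""
    message = (
        f"service class {service_class} is missing goal. "
        f"Performance Index is {performance_index}"
    )
    # The message is enclosed in single quotes, followed by the target user ID.
    return f"SEND '{message}',USER=({userid})"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(build_send_command("STCONLN", 1.8, "SYSADM1"))
```

Because Service Class is the display item, you can expect one such message for each service class for which the situation becomes true.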
Setting thresholds in the WLM Service Class Resources workspace
Now you want to set thresholds in the WLM Service Class Resources workspace that will mirror the
condition you defined in the situation. This will allow you to analyze problems more easily. You decide that
you also want to create a threshold that will produce a warning indicator when a service class is nearing
the critical threshold. You accomplish these goals using the Properties editor for the WLM Service Class
Resources table to add these thresholds to the Performance Index column.
To set the thresholds:
1. Select the WLM Service Class Resources item for SYSTEM1 in the Navigator.
The default workspace is displayed.
2. Right-click in the WLM Service Class Resources table view and select Properties from the popup
menu.
The Properties editor opens.
3. In the Properties editor, select the Thresholds tab.
4. To create the critical threshold:
a. Scroll in the Thresholds editor until you see the Performance Index attribute.
b. Click in the first row beneath the column heading. The critical (red) indicator is already selected.
c. Select a relational operator of Greater Than (GT), then type a value of 1.5.
5. To create a warning threshold:
a. In the second row, click the alert indicator selector next to the row number and select the warning
(yellow) indicator.
b. Click in the second row of the Performance Index attribute, select the Greater Than (GT)
relational operator, and type a value of 1.0.
c. Click OK to save the thresholds and close the editor.
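The two thresholds defined in these steps can be summarized as a small classification rule. The following Python sketch is illustrative only; the function and constant names are hypothetical, and the sketch assumes that the critical condition takes precedence, because a value above 1.5 also satisfies the warning condition.

```python
CRITICAL_THRESHOLD = 1.5  # red indicator, mirrors the situation formula
WARNING_THRESHOLD = 1.0   # yellow indicator, service class is nearing the critical value


def classify_performance_index(performance_index: float) -> str:
    """Return the indicator that the thresholds above would show for a value."""
    if performance_index > CRITICAL_THRESHOLD:
        return "critical"
    if performance_index > WARNING_THRESHOLD:
        return "warning"
    return "normal"


if __name__ == "__main__":
    for value in (0.8, 1.2, 1.7):
        print(value, classify_performance_index(value))
```

Both comparisons are strict (Greater Than), so a Performance Index of exactly 1.5 shows the warning indicator, not the critical one.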
Analyzing a problem
When the zOS_Critical_SvcClass_Missed_Goal situation is true on a monitored system, you are notified
that a service class is not meeting its goal. Now you need to find out why this condition has occurred.
In Tivoli Enterprise Portal, you move your mouse pointer over the event indicator on the
Enterprise icon in the Navigator to see the Event flyover. In the flyover, you right-click the
zOS_Critical_SvcClass_Missed_Goal situation and select Acknowledge from the popup menu to create
an acknowledgment to let the operators monitoring the situation in the data center know you are working
on the situation. In the flyover, you click the link icon next to the service class situation to open its event
workspace. You compare the initial situation values and the current situation values to see if the high
performance index value is persisting. Since it has taken you several minutes to respond to the situation
notification, and the value is staying high, you decide to analyze the problem further to see if you can
prevent this problem from arising.
You navigate to the WLM Service Class Resources workspace for the affected system to examine the
performance information. The thresholds you previously set in the Performance Index column help you
pinpoint the problem service class periods. In this scenario, a Critical indicator identifies the STCONLN
service class. You note the goal importance for the service class in case an adjustment is required to
address the problem. In addition, you examine the overall performance of service classes with less
important goals to determine if they can be adjusted to allow more resources to be available to the more
important service classes.
To get more information about the problem, you use the Navigator to display the Sysplex-level Service
Classes Data for Sysplex workspace. From there you link to the Address Spaces Workspace for Service
Class for the STCONLN service class. In this workspace you can examine data for all the address spaces
in the service class. For service classes like STCONLN with a goal type of Velocity, CPU is the primary
resource required for meeting this goal. You note the address spaces with the highest CPU utilization, as
these may require further investigation. Sorting the table by the CPU Percent column helps you identify
the address spaces with highest CPU usage. You also examine the address space list to verify that there
are no unexpected address spaces.
Stepping back to the Service Classes Data for Sysplex workspace using the backward navigation
arrow, you link to the Workflow Analysis Workspace for Service Class to determine the greatest resource
impactor for the STCONLN service class. In this case, you determine the major bottleneck for the service
class or its associated address spaces is Waiting on CPU, so you want to examine the performance
information for the LPAR and central processing complex where the workload is running.
You use the LPAR Clusters workspace to examine the performance information for the LPAR and CPC
where the workload is running. You examine the performance parameters listed below to determine a
path for problem resolution:
• CPU % Index: A value of 1 or greater indicates that the actual LPAR physical processor utilization meets
or exceeds the configured targets. A value less than 1 indicates that the LPAR is not able to obtain the
resources it is targeted to obtain (based on its defined weight).
• Effective Weight Index: A value of 1 or greater indicates that the ability of the LPAR to obtain logical
processor resource meets or exceeds the defined targets. A value less than 1 indicates that the LPAR is
not able to obtain the resources it is targeted to demand (based on its defined weight).
• Current, Initial, Minimum and Maximum Weights for the LPAR: In a shared physical processor
configuration, the LPAR WEIGHT determines the relative importance of the LPAR for the allocation
of processor resources. In an IRD configuration, the weights will be adjusted within the maximum and
minimum bounds.
• CPU % Ready: Indicates the percent of time that the LPAR had “ready” work and was not dispatched
(for example, because no processors are available).
• LPAR Capping Status: Indicates if “capping” is defined for the LPAR. Capping prevents an LPAR from
obtaining additional processor resource even when other LPARs are not using available resources.
In a Sysplex configuration, the performance of a given service class should be examined in all the LPARs
where the service class workload is running. This will help balance any performance adjustments that
may be implemented to resolve the problem.
Monitoring cryptographic coprocessors
The following three scenarios illustrate how you can use the data collected and presented by OMEGAMON for z/OS
to monitor and improve your cryptographic services:
• “Validating your cryptography configuration” on page 64
• “Monitoring and improving cryptography performance” on page 65
• “Monitoring and improving cross-system ICSF performance” on page 68
Validating your cryptography configuration
Many cryptography problems are the result of configuration errors such as failure to assign coprocessors
to the z/OS system, offline or unavailable coprocessors, disabled public keys, or invalid master keys. This
scenario illustrates how you can use OMEGAMON for z/OS to check your cryptography configuration and
correct any errors you discover.
To check the configuration of your coprocessors:
1. In the Navigator, expand the item for a system in a Sysplex and scan the tree for the Cryptographic
Coprocessors entry:
Enterprise
  Windows Systems
  z/OS Systems
    SYSPLEX1:MVS:SYSPLEX
      Coupling Facility Policy Data for Sysplex
      Coupling Facility Structures Data for Sysplex
      ...
    Sys1
      MVS Operating System
        DEMOPLX:SYS1:MVSSY
          Address Space Overview
          Channel Path Activity
          Common Storage
          Cryptographic Coprocessors
          ...
    Sys2
    SYSPLEX2:MVS:SYSPLEX
2. Select Cryptographic Coprocessors to display the default Cryptographic Services workspace.
3. Check the data in the ICSF Subsystem Status view to make sure that
• The ICSF subsystem is configured correctly.
• Master keys are loaded and set correctly.
• Coprocessors are online and active.
• Cryptography services are operational.
4. If several ICSF subsystems are installed on images that share coprocessors, and a monitoring agent
is installed on each subsystem, inspect the values for each subsystem. Also, be sure to check cross-
system ICSF performance (see “Monitoring and improving cross-system ICSF performance” on page
68).
Recheck the Cryptographic Coprocessor workspaces after any changes or adjustments to the
cryptography configuration.
Monitoring and improving cryptography performance
The cryptographic coprocessor data collected by OMEGAMON for z/OS helps you make load-balancing
decisions to improve cryptography performance.
Cryptography performance monitoring on each Integrated Cryptographic Service Facility (ICSF)
subsystem has two main components.
• Service call performance monitoring, which involves gathering data such as arrival rate of service
requests, time to complete each service call, and queue lengths.
• Top user performance monitoring, which involves determining which job names are the heaviest users
of cryptography services.
Checking service call performance
Use the Service Call Performance workspace to evaluate how well service requests are being handled.
You can link to this workspace from the Service Call Performance table of the Cryptographic Services
workspace for a particular system (see steps 1 and 2 of “Validating your cryptography configuration” on
page 64), or from the Service Call Performance by System table of the Cross-System Cryptographic
Coprocessors Overview workspace. Starting from the Cross-System workspace allows you to gain an
overview of performance and to quickly check performance details for several systems in succession.
Take the following steps to review service requests, starting from the Cross-System Cryptographic
Coprocessors Overview workspace:
1. Select the z/OS Systems item in the Navigator.
The Sysplex Enterprise Overview workspace is displayed.
2. Right-click the z/OS Systems Navigator item and select Workspace > Cross-System Cryptographic
Coprocessors Overview from the popup menu.
The Cross-System Cryptographic Coprocessors Overview workspace is displayed.
3. In the Service Call Performance by System table view, click the link icon beside a system SMFID.
The Select Target box is displayed.
4. Select the Cryptographic Coprocessors node for the system for which you want data.
The Service Call Performance workspace is displayed.
5. In the Service Call Performance workspace:
• Check the Average Arrivals per Minute bar chart to see which services are being called most
frequently.
• Check the Average Service Time per Call bar chart to see which services are taking the longest time
to complete.
Service calls that are taking the longest time to complete, but are not arriving frequently, probably do
not pose a performance problem. Performance problems tend to occur when a particular service call
both arrives frequently and takes a long time to complete.
• Check the Average Pending per Call bar chart for queue length.
A high number of pending requests for a particular service call would indicate a performance
problem.
• Use the Average Bytes per Service Call bar chart to see whether the service calls with the largest
number of bytes also have the highest service times.
Relatively high service times are to be expected for calls that have the largest number of bytes.
Note: Byte counts are available for some but not all service calls. Therefore, the data shown in the
Average Bytes per Service Call bar chart are correct but incomplete. For a list of the service calls for
which byte counts are available, see the online help.
6. If any of the statistics displayed in the bar charts do not seem to make sense or if you need more
information about a service call’s performance, examine detailed data in the Service Call Performance
table. For explanations of all the attributes in this table, see the online help.
7. To check service call performance on another system, use the Backward navigation button (on the
Tivoli Enterprise Portal toolbar in desktop mode) or the Back button of your browser to return to the
Cross-System Cryptographic Coprocessors Overview workspace, and then repeat steps 3–6 to view
other systems.
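The guidance in step 5, that problems tend to occur when a service call both arrives frequently and takes a long time to complete, can be sketched as a simple ranking. The following Python is a sketch under stated assumptions, not something the product computes: the record fields, the sample service call names, and the use of arrival rate multiplied by service time as a load score are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceCallStats:
    name: str                   # service call name (hypothetical samples below)
    arrivals_per_minute: float  # read from the Average Arrivals per Minute chart
    avg_service_time_ms: float  # read from the Average Service Time per Call chart
    avg_pending: float          # read from the Average Pending per Call chart


def load_score(call: ServiceCallStats) -> float:
    """Assumed heuristic: arrival rate multiplied by average service time."""
    return call.arrivals_per_minute * call.avg_service_time_ms


def rank_by_load(calls: List[ServiceCallStats]) -> List[ServiceCallStats]:
    """Return the calls ordered from highest to lowest load score."""
    return sorted(calls, key=load_score, reverse=True)


if __name__ == "__main__":
    sample = [
        ServiceCallStats("SVC_A", 600, 0.2, 0.0),  # frequent but fast
        ServiceCallStats("SVC_B", 2, 40.0, 0.0),   # slow but rare
        ServiceCallStats("SVC_C", 300, 5.0, 1.5),  # frequent and slow
    ]
    for call in rank_by_load(sample):
        print(call.name, load_score(call), call.avg_pending)
```

In this sample, SVC_B has the longest service time but ranks last because it rarely arrives, which matches the point made in step 5.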
Checking top user performance
Use the Top User Performance workspace to learn which jobs are the heaviest users of cryptography
services.
You can link to this workspace from the Cryptographic Services workspace for a particular system (see
steps 1 and 2 of “Validating your cryptography configuration” on page 64) or from the Top Users by
System table in the Cross-System Cryptographic Coprocessors Overview workspace (see steps 1 and 2 of
“Checking service call performance” on page 66).
Check the bar charts in the Top User Performance workspace to see which jobs:
• Are requesting cryptography services most frequently.
• Have the highest average service time.
• Are waiting longest for their requests to move to the top of the queue.
• Are requesting services with the highest byte counts.
Note: Byte counts are not available for all service calls. Therefore, the data shown in the Top 10 Average
Bytes per Call bar chart are correct but incomplete. For a list of the service calls for which byte counts are
available, see the online help.
If you need more information about top user performance, you can find detailed data in the Performance
by Top Users table. One particularly useful piece of information in this table is the LastSvcDesc column,
which shows the service requested most recently by each of the top users. Click the refresh button
several times and see whether the same service call keeps showing up. If so, that service call is being
used heavily by the top users and may be implicated in any performance problems that arise.
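If you record the LastSvcDesc values each time you refresh, you can tally them to see which service call recurs. The following Python sketch shows one way to do the tally. The function name, the job names, and the service descriptions are hypothetical, and the sketch assumes you have copied the values from the table yourself.

```python
from collections import Counter
from typing import Dict, List, Tuple


def most_frequent_services(snapshots: List[Dict[str, str]]) -> List[Tuple[str, int]]:
    """Tally LastSvcDesc values across refreshes.

    Each snapshot maps a job name to the LastSvcDesc value seen at one refresh.
    """
    counts: Counter = Counter()
    for snapshot in snapshots:
        counts.update(snapshot.values())
    # most_common() orders the services from most to least frequently seen.
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical observations from three refreshes.
    observed = [
        {"JOBA": "Encipher", "JOBB": "Decipher"},
        {"JOBA": "Encipher", "JOBB": "Encipher"},
        {"JOBA": "Encipher", "JOBB": "Decipher"},
    ]
    print(most_frequent_services(observed))
```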
Improving performance
Here are some suggestions for improving performance:
• If service times are unacceptably high, you might consider decreasing the strength of cryptography by
reducing the length of the key. Conversely, if you need to increase the strength of cryptography, you can
observe and weigh the performance risk.
• If any of the top job names are relatively unimportant, you might want to reduce their priority on the
system.
• For the most frequently called services and the services with the highest normal service times, you
might want to create your own situations. In each situation, specify combinations of arrival rate and
service time that you consider worrisome (warning) or truly unacceptable (critical). Whenever a service
call reaches a specified threshold, an event alert will be posted on Tivoli Enterprise Portal. You can then
take immediate action to correct the problem.
For instructions on creating situations, see the Tivoli Enterprise Portal online help.
Monitoring and improving cross-system ICSF performance
If several ICSF subsystems are installed on z/OS images that share coprocessors in a Sysplex or
Processor Resource/Systems Manager (PR/SM) complex, the workloads on each subsystem can affect
the performance of the other subsystems. Use the Cross-System Cryptographic Coprocessors Overview
workspace to compare the subsystems’ cryptography performance and to troubleshoot performance
problems.
Access the Cross-System Cryptographic Coprocessors Overview workspace as described in steps 1 and 2
of “Checking service call performance” on page 66.
To check cross-system performance:
• Check the ICSF Subsystems by System table to make sure the coprocessors are online and
cryptography services are active on all systems.
The predefined situation Crypto_No_PCI_Coprocessors defines the lack of an online PCI coprocessor as
a warning condition. A matching threshold has been set for the 1 PC1 column. So when this condition
occurs, warning event indicators appear in the table and in the Navigator.
If you find configuration or availability problems, correct them immediately.
• Check the Service Call Performance by System table to determine the average request arrival rate,
service time in milliseconds, queue length, and byte count per service call for each system.
If all service calls have been processed on a single system, but service time is well under a millisecond,
and the Pending column shows no request queue, there is no performance problem. However, if the
same system continues to be used exclusively and the arrival rate increases, service time and queue
length would also increase, and a serious performance problem might develop. In such a case, you
might consider rebalancing workloads among systems to correct the problem.
• Check the Top Users by System table to see whether one or two job names seem to be monopolizing
cryptographic services.
If any of the top job names are relatively unimportant, you might want to reduce their priority on the
system.
Look at the LastSvcDesc column, which shows the service requested most recently by each of the top
users. Refresh the workspace several times over various intervals and see whether the same service call
keeps showing up. If so, that service call is being used heavily by the top users and may be implicated in
performance problems. See “Improving performance” on page 67 for suggestions for improving service
call performance.
Detecting CPU looping address spaces
Address spaces can occasionally fall into a CPU loop where a task executes instructions endlessly.
Looping is usually an unproductive event, using CPU resources that could be better spent on other
workloads. Looping may be a symptom of a storage corruption within the application or perhaps a design
flaw that did not anticipate some rare set of environmental circumstances. Whatever the cause of a loop,
it has proven difficult in the past to detect that an address space has begun to loop.
On the surface, detecting a CPU loop should be simple: just look for an address space that is using 100%
CPU. On z/OS, this is not a good strategy. Most logical partitions (LPARs) are defined with several logical
processors (LPs) that can each run instructions for different dispatchable units such as tasks, enclaves,
and service request blocks (SRBs). Under these circumstances, does 100% CPU refer to 100% of a single
LP or 100% of all the logical processors in the LPAR? Typically, a looping address space is consuming a
single logical processor.
Another confounding factor is the z/OS dispatching algorithms, including Workload Manager (WLM), which
actively try to distribute processor resources appropriately among all the competing dispatchable units.
These algorithms interrupt a looping job to dispatch other work that is not getting the CPU resource that
it is due under the defined service policy. Eventually, the looping job gets redispatched and squanders
more CPU. But because of the interrupts, the job’s measured CPU% will drop. Often looping jobs will be
parked by the system policy so much that their measured CPU percentage will be too low to be detected
by simple threshold settings.
Another popular strategy for detecting CPU loops is to look for address spaces with high CPU usage and
low or no I/O activity. But as already noted, it is not easy to define what “high CPU usage” means; can it
be said with confidence that a job using little or no I/O is clearly misbehaving? What about an application
that has much of its working data cached in memory so that it can be more responsive to transactions?
This is a good performance strategy, but it means that the application will also present a profile of low or
no I/O activity.
OMEGAMON for z/OS offers a metric, CPU Loop Index, which is designed to overcome these issues and
make detecting CPU loops an easier task. The purpose of this metric is to characterize the intent of an
address space to use the CPU. Looping jobs will show an unrelenting intent to use CPU to the exclusion
of any other resource. Even when they are parked by WLM or other z/OS policy actions, their intent to use
the CPU can be detected.
The calculation of the CPU Loop Index is done over 10 or more minutes for workspace reporting.
For situations, the calculation is done over either 10 minutes or the refresh period of the situation,
whichever is larger. There is no need to use persistence. Service classes determined to be of low
importance automatically use longer periods of calculation to avoid false positive indications.
Determining the intent of a job
The OMEGAMON for z/OS Bottleneck Analysis feature samples the execution state of every address
space in z/OS every few seconds. An address space typically occupies one execution state at a time.
These execution states are things like Using I/O, Waiting for I/O, Waiting for Enqueue, Waiting for HSM
Recall, Swapped MPL Delay, Using CPU, and Waiting for CPU. OMEGAMON for z/OS defines more than 60
execution states. The execution states that indicate intent to use CPU are Using CPU, CPU Wait, Using IFA,
IFA Wait, Using zIIP, and zIIP Wait.
On every sample taken by Bottleneck Analysis, each address space is classified into one of these
execution states and a count is added to the bucket for this execution state. Over time, the profile of
execution states builds up for every address space. If you divide the count in any bucket, for example the
Using I/O bucket, by the total count summed over all buckets, you get the percentage of samples where
the address space was found using I/O. If you sum the counts in all the “intent to use CPU” buckets and
divide by the total count, you get the CPU Loop Index. Address spaces in a CPU loop have counts almost
exclusively in the “intent to use CPU” buckets. This means the index will likely be at or very near 100%.
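The calculation described above can be sketched in a few lines of Python. This is a minimal illustration of the arithmetic, not the product's implementation: the function name, the dictionary layout, and the sample counts are assumptions. The six state names are the "intent to use CPU" states listed earlier in this section.

```python
from typing import Dict

# Execution states that indicate intent to use CPU, as listed in this section.
CPU_INTENT_STATES = frozenset(
    {"Using CPU", "CPU Wait", "Using IFA", "IFA Wait", "Using zIIP", "zIIP Wait"}
)


def cpu_loop_index(bucket_counts: Dict[str, int]) -> float:
    """Return the percentage of samples found in an intent-to-use-CPU state."""
    total = sum(bucket_counts.values())
    if total == 0:
        return 0.0  # no samples yet, so no index can be computed
    intent = sum(
        count for state, count in bucket_counts.items() if state in CPU_INTENT_STATES
    )
    return 100.0 * intent / total


if __name__ == "__main__":
    # Hypothetical sample counts for a looping job and for an I/O-bound job.
    looping_job = {"Using CPU": 70, "CPU Wait": 28, "Using I/O": 2}
    io_job = {"Using CPU": 40, "Using I/O": 35, "Waiting for I/O": 25}
    print(cpu_loop_index(looping_job))
    print(cpu_loop_index(io_job))
```

In the sample data, the looping job was found using CPU in only 70 of 100 samples, yet its index is 98% because the CPU Wait samples also count as intent to use CPU.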
Note: z/OS systems have three processor types that can be used by normal applications. The first type
is the standard general processor, which is commonly referred to as a CPU in OMEGAMON for z/OS. The
second processor type is the System z Application Assist Processor, which is usually referred to as a
zAAP processor. In its early days this processor was called the Integrated Facility for Applications, or
IFA. OMEGAMON for z/OS still uses this terminology. zAAP processors are used to execute Java™ Virtual
Machine processes. The third processor type is the System z Integrated Information Processor, or zIIP.
zIIPs are most often used to execute DB2 processes, but can be used by other specialized SRB processes.
The KM5_CPU_Loop_Warn situation
OMEGAMON for z/OS includes a model situation, KM5_CPU_Loop_Warn, which can be distributed to all
monitored LPARs and started automatically. This situation looks for jobs with a CPU Loop Index higher
than 95%. The situation takes a sample every five minutes looking for this condition and requires that two
consecutive samples show the condition to be true before it raises an alert.
The requirement that two consecutive samples meet the condition before an alert is raised is intended to
minimize false positive results. The probability of false positives is greatly reduced by extending the time
period over which an address space’s behavior is watched when that address space has a high percentage
of CPU Wait time.
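The two-consecutive-samples requirement can be pictured with the following Python sketch. It is an illustration of the rule as described here, not the product's evaluation logic; the function name and the sample values are hypothetical.

```python
from typing import Iterable


def alert_raised(
    samples: Iterable[float],
    threshold: float = 95.0,
    required_consecutive: int = 2,
) -> bool:
    """Return True if enough consecutive samples exceed the threshold.

    Each sample is the CPU Loop Index (percent) seen at one five-minute interval.
    """
    consecutive = 0
    for value in samples:
        # A sample at or below the threshold resets the run of true samples.
        consecutive = consecutive + 1 if value > threshold else 0
        if consecutive >= required_consecutive:
            return True
    return False


if __name__ == "__main__":
    print(alert_raised([96.0, 90.0, 97.0]))  # isolated high samples
    print(alert_raised([90.0, 96.0, 98.0]))  # two high samples in a row
```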
A high CPU Loop Index value with much of the index due to CPU Wait time indicates that an address space
does not have very high importance to WLM. In this case, the address space is watched for longer time
periods to see if it continues to exhibit looping behavior or not. Bottleneck data for address spaces such
as these is accumulated for up to 120 minutes to see if the CPU Loop Index continues to be high. This
helps avoid raising an alarm for address spaces that are just starved for attention.
The thresholds established for this false positive avoidance are:
• The address space must have a CPU Loop Index value higher than 90% as seen by the monitoring agent.
• If the address space’s CPU Wait time contributes between 50% and 75% of the total CPU Loop Index,
the address space is watched for up to 60 minutes before its high index value is reported. The value
reported during this waiting period will grow towards 90% but will not exceed that threshold.
• If the address space’s CPU Wait time contributes more than 75% of the total CPU Loop Index, the
address space is watched for up to 120 minutes before its high index value is reported. The value
reported during this waiting period will grow towards 90%, but will not exceed that threshold.
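The watch periods listed above can be sketched as a lookup. The following Python is illustrative only. The function name is hypothetical, the 10-minute base period is taken from the minimum calculation period described earlier, and the handling of values that fall exactly on 50% or 75% is an assumption, because the text does not state which range includes the boundaries.

```python
def watch_period_minutes(cpu_wait_share_pct: float) -> int:
    """Return the longest time an address space is watched before its
    high CPU Loop Index is reported.

    cpu_wait_share_pct is the share of the CPU Loop Index that is due
    to CPU Wait time, as a percentage.
    """
    if cpu_wait_share_pct > 75.0:
        return 120  # mostly waiting for CPU: watched for up to 120 minutes
    if cpu_wait_share_pct >= 50.0:
        return 60   # between 50% and 75%: watched for up to 60 minutes
    return 10       # assumed base period: the 10-minute minimum calculation


if __name__ == "__main__":
    for share in (30.0, 60.0, 80.0):
        print(share, watch_period_minutes(share))
```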
Investigating and identifying looping jobs
This scenario involves three jobs that show a CPU loop: BKEALCP1, BKEALCP2, and BKEALCP3. This
scenario also has three other jobs that run high CPU for a while, but then run I/O activity for a while and
then return to the CPU: BKEALIO, BKEALIO2, and BKEALIO3. These jobs represent high CPU usage jobs
that are not in a CPU loop.
Using normal ISPF SDSF screens, sorted for CPU% ascending, you can see that these jobs are the most
CPU intensive work in the LPAR (Figure 4 on page 71). Each of the sample looping jobs is using 15.81%
of the LPAR’s CPU. The I/O jobs are using about 4.33% each.
Figure 4. Sample jobs
It is not likely that a performance analyst looking at this report would expect that any of these jobs
was in a CPU Loop. The KM5_CPU_Loop_Warn situation, however, does see BKEALCP1, BKEALCP2, and
BKEALCP3 as suspect jobs and an event is raised in the Tivoli Enterprise Portal (Figure 5 on page 72).
Figure 5. Event flyover for KM5_CPU_Loop_Warn
The situation event workspace for KM5_CPU_Loop_Warn (Figure 6 on page 73) indicates that BKEALCP1,
BKEALCP2, and BKEALCP3 have very high CPU Loop Index values. The accompanying expert advice notes
that a high CPU Loop Index is not a guarantee that the job is looping and it is possible that a well-behaved
job is in a normal period of intensive CPU activity. It is up to the site analyst to recognize jobs that
normally run long periods of CPU instructions. Some scientific workloads may fit this profile, but it is unlikely
that a normal business workload would use this much CPU for this long (10 minutes at minimum). Users
are advised to examine the address spaces.
Figure 6. Situation event workspace for KM5_CPU_Loop_Warn
A number of OMEGAMON for z/OS workspaces can be used to examine these jobs further, starting at the
Address Space Overview workspace (Figure 7 on page 73).
Figure 7. Address Space Overview workspace
The six jobs of interest appear first in the Address Space CPU Utilization Summary view. In this view,
the rows are sorted in descending order of CPU Percent so that no higher CPU users are missed on
subsequent view pages. To look at the bottleneck analysis for these jobs to see which execution states
they are spending their time in, you right-click the icon in the Address Space Counts view and select
the Address Space Bottleneck Summary link.
The Address Space Bottleneck Summary workspace (Figure 8 on page 74) displays the CPU Loop Index
value for the three suspect looping jobs. You notice that the jobs that are using heavy CPU but are not
looping, BKEALIO, BKEALIO2, and BKEALIO3, all have significantly lower indexes. When you look at the
other bottleneck data you see that the Using CPU values by themselves are not unequivocal indicators
of a loop. It is only when you combine the Using CPU and CPU Wait numbers that you see the dramatic
effect.
Figure 8. Address Space Bottleneck Summary workspace
To explore job BKEALCP3 in more detail, you use a dynamic link to drill down to OMEGAMON for MVS. You
right-click the navigation link in the row for job BKEALCP3 and select the OMEGAMON for MVS – Job
Details link. The OMEGAMON for MVS – Job Details workspace opens and a logon window is displayed.
You enter a user ID and password for the terminal session (if required), and click OK. The Examine Details
for Job BKEALCP3 screen space in OMEGAMON for MVS opens in the terminal view (see Figure 9 on page
75).
Figure 9. Examine Details for Job BKEALCP3 screen space
You can now continue exploring BKEALCP3 by running an INSPECT of the job (enter s beside F INSPECT).
The INSPECT produces the results shown in Figure 10 on page 75.
Figure 10. Inspect results for BKEALCP3
So you know that all the CPU is being generated in the CSECT KOSMIGSC. You narrow this to specific
instruction ranges by moving the cursor over the hot task and changing the OFFset and Granularity values
(see Figure 11 on page 76). So far you see that all the CPU is within the narrow range of instructions from
offset x’4A0’ to x’4F0’.
Figure 11. Drill down of INSPECT data
You narrow the granularity even further (Figure 12 on page 77). Now you find that the looping
instructions lie between offsets x’4D5’ and x’4DF’. With this detail you can go back to the application
source and find the offending instructions. You can also take action from this interface to cancel the
looping job.
Figure 12. Further drill down on INSPECT data
To cancel the job, you press the PF3 key until you arrive at the OMEGAMON for MVS main menu
(Figure 13 on page 77).
Figure 13. OMEGAMON main menu
From here you select ACTIONS to get to the Action Commands screen (Figure 14 on page 78).
Figure 14. OMEGAMON Action Commands screen
And now you can select the OPS CMDS screen space (Figure 15 on page 79).
Note: Most commands, including the z/OS CANCEL command, require the user to be authorized. In
OMEGAMON for MVS you can gain authorization by typing the /pwd command in the top line of the
screen and then entering the site-specific password.
Figure 15. OPS CMDS screen space
On the OPS CMDS screen, you enter the z/OS operator command to cancel job BKEALCP3. You can now
use the Backward navigation button on the Tivoli Enterprise Portal interface (upper left frame above the
navigation view in the desktop client) or the back button on your browser (in a browser client) to return to
the Address Space Bottleneck Summary view.
The BKEALCP3 job is no longer active (Figure 16 on page 80).
Figure 16. Address Space Bottlenecks Summary workspace without BKEALCP3
Support information
If you have a problem with your IBM software, you want to resolve it quickly. IBM provides the following
ways for you to obtain the support you need:
Online
Go to the IBM Software Support site at http://www.ibm.com/software/support/probsub.html and
follow the instructions.
Troubleshooting Guide
For more information about resolving problems, see Introduction to troubleshooting.
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the
products, services, or features discussed in this document in other countries. Consult your local IBM
representative for information on the products and services currently available in your area. Any reference
to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this
document. The furnishing of this document does not give you any license to these patents. You can
send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785 U.S.A.
For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property
Department in your country or send inquiries, in writing, to:
IBM World Trade Asia Corporation
Licensing
2-31 Roppongi 3-chome, Minato-ku
Tokyo 106, Japan
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS"
WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE.
Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore,
this statement might not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically
made to the information herein; these changes will be incorporated in new editions of the publication.
IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this
publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in
any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of
the materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose of enabling: (i) the
exchange of information between independently created programs and other programs (including this
one) and (ii) the mutual use of the information which has been exchanged, should contact:
IBM Corporation
2Z4A/101
11400 Burnet Road
Austin, TX 78758 U.S.A.
Such information may be available, subject to appropriate terms and conditions, including in some cases
payment of a fee.
The licensed program described in this document and all licensed material available for it are provided by
IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any
equivalent agreement between us.
If you are viewing this information in softcopy form, the photographs and color illustrations might not be
displayed.
Trademarks
IBM, the IBM logo, and ibm.com® are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked
terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these
symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information
was published. Such trademarks may also be registered or common law trademarks in other countries.
A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at
http://www.ibm.com/legal/copytrade.shtml.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon,
Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or
its subsidiaries in the United States and other countries.
Linux® is a trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the
United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other company, product, and service names may be trademarks or service marks of others.
Privacy policy considerations
IBM Software products, including software as a service solutions, (“Software Offerings”) may use cookies
or other technologies to collect product usage information, to help improve the end user experience,
to tailor interactions with the end user or for other purposes. In many cases no personally identifiable
information is collected by the Software Offerings. Some of our Software Offerings can help enable you
to collect personally identifiable information. If this Software Offering uses cookies to collect personally
identifiable information, specific information about this offering’s use of cookies is set forth below.
Depending upon the configurations deployed, this Software Offering may use session cookies that
collect each user’s user name for purposes of session management, authentication, and single sign-on
configuration. These cookies cannot be disabled.
If the configurations deployed for this Software Offering provide you as customer the ability to collect
personally identifiable information from end users via cookies and other technologies, you should seek
your own legal advice about any laws applicable to such data collection, including any requirements for
notice and consent.
For more information about the use of various technologies, including cookies, for these purposes,
see IBM’s Privacy Policy at http://www.ibm.com/privacy and IBM’s Online Privacy Statement at
http://www.ibm.com/privacy/details, in the section entitled “Cookies, Web Beacons and Other
Technologies”, and the “IBM Software Products and Software-as-a-Service Privacy Statement” at
http://www.ibm.com/software/info/product-privacy.
IBM®
Product Number: 5698-T01
SC27-4028-02