Troubleshoot Stack Monitoring

The topics in this section provide troubleshooting information to identify and address common issues that may occur while working with Stack Monitoring.

Troubleshoot General Issues

New permissions in resource-types are not propagated

This happens because IAM does not recompile a policy unless there is a change to the policy statement.

For any existing policy that uses resource-types, when new permissions are added to a resource-type, edit the policy by adding a blank space to it, and then save the policy. The change forces IAM to recompile the policy.
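The workaround can also be prepared in a script. The following is a minimal sketch, assuming the policy statements are available as a list of strings. `touch_statements` is a hypothetical helper, not part of any Oracle SDK, and the example statement is illustrative only. The sketch only builds the modified statement list; saving it back to the policy is left to the Console, the OCI CLI, or an SDK.

```python
def touch_statements(statements):
    """Return a copy of the policy statements with a blank space added to the first one.

    Hypothetical helper: the text of the statement changes, its meaning does not.
    """
    if not statements:
        raise ValueError("policy has no statements to edit")
    touched = list(statements)
    touched[0] = touched[0] + " "
    return touched


# Illustrative statement, not taken from a real tenancy.
current = ["Allow group StackMonitoringAdmins to manage stack-monitoring-family in tenancy"]
updated = touch_statements(current)
# Save `updated` back to the policy so that IAM sees a changed statement.
```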

For more information, see New permissions in resource-types are not propagated.

Troubleshoot PeopleSoft

Discovery Job Behavior


[Image: discovery job behavior error]

This is an example of the logs for two Process Scheduler Domain work items: one that succeeded and was saved, and one that reports a domain down error. Each detailed log entry includes its respective work item ID.

[Image: discovery job behavior errors]

Discovery Error Messages

Database validation failed error

The example below is the output from a failed discovery job. Using the work item (WI) ID, search the entire log for additional details to determine why the discovery failed.

[Image: psft general error]
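Searching a log by work item ID can be sketched as a simple line filter. This is a minimal example, assuming the log is available as plain text and that each relevant line contains the work item ID. The function name, the ID format, and the sample log lines are all made up for illustration; real discovery logs will look different.

```python
def lines_for_work_item(log_text, work_item_id):
    """Return every log line that mentions the given work item ID."""
    return [line for line in log_text.splitlines() if work_item_id in line]


# Made-up log lines and IDs, used only to show the filtering.
sample_log = "\n".join([
    "WI-001 Process Scheduler Domain discovery started",
    "WI-002 Database validation failed",
    "WI-001 Resource saved",
    "WI-002 Fetchlet exception: logon denied",
])

for line in lines_for_work_item(sample_log, "WI-002"):
    print(line)
```

Because the filter is a plain substring match, an ID that is a prefix of another ID would match both, so it is safer to search for the complete ID.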

Message Error Troubleshooting

[Image: database validation error]

The fetchlet exception displays Invalid username/password, logon denied.

Verify that the username and password entered in the Database Credentials section of the discovery job are correct.

[Image: troubleshooting db validation]


[Image: database validation error]

The message displays IO Error: The Network Adapter could not establish a connection due to UnknownHostException.

It then displays the host that was entered, followed by the message Name or service not known. This indicates that the host used in the discovery job is incorrect or was mistyped in the PSFT Database section.

[Image: troubleshooting db validation]

Validate the host name and then retry the discovery job.


[Image: database validation error]

A connection failure with the PSFT database displays Connection refused, socket connect lapse.

The host name and address are then displayed along with the port.

This error is triggered when the Database Port is incorrect. Retry the discovery job with the correct port.

[Image: troubleshooting db validation]


[Image: database validation error]

The error message contains a fetchlet exception, and the log displays Listener refused the connection with the following error: ORA-12514, TNS: listener does not currently know of service requested in connect descriptor.

Retry the discovery job and enter the correct Database Service Name in the PSFT Database section.

[Image: troubleshooting db validation]
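The four database validation cases above can be summarized as a lookup from a fragment of the error text to the field that needs checking. The following is a minimal sketch: the function name and the table are hypothetical, and matching on these substrings is an assumption about how the text appears in an actual log.

```python
# Fragment of the error text -> what to check in the discovery job.
# The fragments come from the messages described above.
DB_ERROR_HINTS = [
    ("logon denied", "Check the username and password in the Database Credentials section."),
    ("UnknownHostException", "Check the host name in the PSFT Database section."),
    ("Connection refused", "Check the Database Port."),
    ("ORA-12514", "Check the Database Service Name in the PSFT Database section."),
]


def diagnose_db_error(message):
    """Return the first matching hint, or None if the message is not recognized."""
    lowered = message.lower()
    for marker, hint in DB_ERROR_HINTS:
        if marker.lower() in lowered:
            return hint
    return None


print(diagnose_db_error("IO Error: The Network Adapter could not establish a connection: UnknownHostException"))
```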

Resource families validation failed error

PeopleSoft has the following resource families:

  • Application Server Domain
  • Process Scheduler Domain
  • PeopleSoft Internet Architecture (PIA)

There can be several resources of each family in a discovery job. A discovery job is marked as successful if at least one resource of each family is discovered successfully. Therefore, a job can be successful even if some work items fail for some child resources.
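The success rule can be sketched as follows. This is a minimal illustration, not the service's implementation: the `(family, succeeded)` pair shape and both function names are hypothetical, and the sketch assumes that only the families present in the discovery job are checked.

```python
def failing_families(work_items):
    """Return the families in the job that have no successful work item.

    `work_items` is a list of (family, succeeded) pairs; this shape is
    hypothetical and chosen only for the example.
    """
    families = {family for family, _ in work_items}
    succeeded = {family for family, ok in work_items if ok}
    return sorted(families - succeeded)


def job_successful(work_items):
    """A job passes when every family in it has at least one successful resource."""
    return not failing_families(work_items)


items = [
    ("Application Server Domain", True),
    ("Application Server Domain", False),  # a failed work item is tolerated here
    ("Process Scheduler Domain", True),
    ("PeopleSoft Internet Architecture (PIA)", True),
]
print(job_successful(items))
```

In this example the job passes even though one Application Server Domain work item failed, because another resource of the same family succeeded.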

If the validation fails, the job shows the following logs:

General error message

[Image: general error]

This message indicates that the resource families in the discovery job did not meet the requirement of having at least one successful resource for each family. It then lists the resource families and is followed by summary logs, one for each failing family:

Summary of failing work items


[Image: summary of failing work items]

This log example lists the failed work items for the Application Server Domain family. Use the work item IDs provided to find the rest of the logs, which contain more details about the failures. Each work item can fail for a different reason, so refer to each work item ID in the logs to see the specifics. The following are the possible errors for each work item and their solutions.

Message Error Troubleshooting

[Image: resource families validation error]

This type of error appears when the discovery job is provided with invalid credentials.

The example message shows the description for an Application Server Domain work item, "Discovery failed for oracle_psft_appserv", but the error also applies to a Process Scheduler Domain (oracle_psft_pcrs).

To fix this error, enter the correct credentials in the corresponding section of the discovery job.

[Image: resource families validation error troubleshooting]


[Image: resource families validation error]

This error indicates that a domain is down for the resource that failed discovery.

To fix it, verify in the PeopleSoft console that the application is running, and start the domain again.

This type of error can occur for both Process Scheduler Domain and Application Server Domain resources. The log shows a failed to retrieve message with a NameNotFoundException.

The beginning of the log shows which work item failed, along with the work item ID, which makes it easy to identify the failing resource.


[Image: resource families validation error]

This error occurs when a PIA domain is misconfigured (down status).

Elasticsearch errors

If Elasticsearch is discovered together with PeopleSoft, the success or failure of the PeopleSoft discovery depends on the Elasticsearch work item. If an error occurs while discovering Elasticsearch and the work item fails, the PeopleSoft discovery job is not successful either.

The following message is shown when an Elasticsearch error occurs. It provides a work item ID that can be used to find the detailed logs about the cause of the failure.

General Error


[Image: Elasticsearch general error]

Message Error Troubleshooting

[Image: Elasticsearch error]

Failed to collect data, 500 SERVER ERROR.

An error occurred while trying to connect to the specified host and collect data from it.

This error is logged when an invalid username was provided in the discovery job.


[Image: Elasticsearch error]

Failed to collect data, status 401.

Access is unauthorized because the credentials are invalid.

Make sure that the correct password is entered when performing the discovery.


[Image: Elasticsearch error]

FileNotFoundException.

The TrustStore path provided is incorrect.

This can happen when the value entered in the TrustStore path field is mistyped or when the file does not exist in the specified location.

Also make sure that the file is accessible on the agent host.


[Image: Elasticsearch error]

Password verification failed.

The TrustStore password provided is incorrect.
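For the FileNotFoundException case, the TrustStore path can be checked before retrying the discovery. The following is a minimal sketch, assuming it is run on the agent host as the user that runs the agent. The function name and the example path are hypothetical, and the sketch does not verify the TrustStore password.

```python
import os


def check_truststore_path(path):
    """Return a list of problems found with the TrustStore path (empty if none)."""
    problems = []
    if path != path.strip():
        problems.append("path has leading or trailing whitespace")
    if not os.path.isfile(path):
        problems.append("file does not exist at the specified location")
    elif not os.access(path, os.R_OK):
        problems.append("file exists but is not readable by the current user")
    return problems


# Hypothetical path, used only for illustration.
for problem in check_truststore_path("/path/to/truststore.jks"):
    print(problem)
```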

Troubleshoot SOA

Monitoring SOA applications created from Marketplace images:

When a SOA application is provisioned from a Marketplace image, the SOA-related metrics are not populated. The Marketplace image places the SOA and WebLogic configuration files in two separate locations. To populate the SOA metrics, copy the configuration files from the SOA Suite directory to the WebLogic directory.

Copy the files as indicated and restart WebLogic.

SOA Infra metrics start appearing a few minutes after the WebLogic restart.

The Marketplace image installs SOA Suite in a different location than the WebLogic stack:

WebLogic: /u01/app/oracle/middleware
SOA Suite: /u01/app/oracle/suite/

Please copy the following files:

From: /u01/app/oracle/suite/em/adml

-rwxrwxr-x. 1 oracle oracle 21156 May 18 2011 server-scheduler_service.xml
-rwxrwxr-x. 1 oracle oracle 15788 May 18 2011 domain-scheduler_service.xml
-rwxrwxr-x. 1 oracle oracle 2929 Nov 11 2013 server-bea_alsb.xml
-rwxrwxr-x. 1 oracle oracle 242238 Feb 28 2016 server-oracle_soainfra.xml
-rwxrwxr-x. 1 oracle oracle 232504 Jul 10 2016 server-oracle_soainfra_partition.xml
-rwxrwxr-x. 1 oracle oracle 2992 Aug 15 2016 server-oracle_soa_composite-11.0.xml
-rwxrwxr-x. 1 oracle oracle 95241 Jan 16 2017 domain-oracle_soainfra.xml

To: /u01/app/oracle/middleware/em/adml
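The copy step can be scripted. The following is a minimal sketch that uses the file names and directories listed above. The function name is hypothetical, and the sketch assumes that the target directory already exists and that the script runs as a user with read access to the source and write access to the target. It does not restart WebLogic.

```python
import shutil
from pathlib import Path

# File names and directories as listed above.
ADML_FILES = [
    "server-scheduler_service.xml",
    "domain-scheduler_service.xml",
    "server-bea_alsb.xml",
    "server-oracle_soainfra.xml",
    "server-oracle_soainfra_partition.xml",
    "server-oracle_soa_composite-11.0.xml",
    "domain-oracle_soainfra.xml",
]
SOURCE_DIR = Path("/u01/app/oracle/suite/em/adml")
TARGET_DIR = Path("/u01/app/oracle/middleware/em/adml")


def copy_adml_files(source_dir, target_dir, names=ADML_FILES):
    """Copy the listed files, keeping permissions and timestamps.

    Returns the names that were missing from source_dir.
    """
    missing = []
    for name in names:
        src = Path(source_dir) / name
        if src.is_file():
            shutil.copy2(src, Path(target_dir) / name)
        else:
            missing.append(name)
    return missing


if __name__ == "__main__":
    not_found = copy_adml_files(SOURCE_DIR, TARGET_DIR)
    if not_found:
        print("Missing from source:", ", ".join(not_found))
```

`shutil.copy2` is used instead of `shutil.copy` so that the file modes and timestamps shown in the listing are preserved where the platform allows it.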

Troubleshoot a Maintenance Window

Retry a Maintenance Window

A retry can be performed only for an active Maintenance Window, and only after an operation is marked as Partial Success.

To use the Retry option, open the actions menu of the Maintenance Window.

Updated topology

When the topology of a resource changes, for example when a cluster adds or removes one or more of its servers, the Maintenance Window is not updated automatically. To update the resources included in the Maintenance Window after a topology change, edit the Maintenance Window to match the resource's new topology.