System and Application Error Handling Overview
  • 24 Nov 2020

Overview

This article gives an overview of three important factors in error handling on a Production server and outlines an example framework for managing errors across that server. Such a framework is necessary because certain errors cannot be caught or managed within Decisions, or are better managed with other tools. The three factors are monitoring uptime, handling process errors inside the Decisions environment, and handling system errors outside of it.

Uptime

Uptime confirms that the Decisions service is active and available. When Service Host Manager is down, the Decisions login page shows an error and exception message, but this only informs the user that the service cannot be reached; it is not a form of in-depth monitoring. An Uptime Monitoring Tool is recommended to watch the status of a site. Some of the available tools are listed below.

These tools work by requesting a web page at regular intervals (usually every five to 15 minutes) and monitoring the response. If the response returns an error, or the request cannot reach the page, the tool sends a notification about the outage so that the administrator can take appropriate action. Some examples of uptime monitoring tools are NodePing (which we use to monitor our hosted environments) and Pingdom Service Uptime Host (Tracker).
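The polling behavior described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not how NodePing or any other product is implemented. The names `check_site`, `monitor`, and `http_status` are hypothetical, and treating any status of 400 or higher as "down" is an assumption made for the example.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    url: str
    up: bool
    detail: str


def http_status(url: str, timeout: float = 10.0) -> int:
    """Send a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code


def check_site(url: str, fetch: Callable[[str], int] = http_status) -> CheckResult:
    """Run one probe. Assumption: any status below 400 counts as 'up'."""
    try:
        status = fetch(url)
    except OSError as err:
        # URLError, timeouts, and refused connections are OSError subclasses.
        return CheckResult(url, False, f"unreachable: {err}")
    return CheckResult(url, status < 400, f"HTTP {status}")


def monitor(
    url: str,
    notify: Callable[[CheckResult], None],
    interval_seconds: float = 300.0,
    max_checks: Optional[int] = None,
    fetch: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Probe url every interval_seconds and call notify on each failed check."""
    checks = 0
    while max_checks is None or checks < max_checks:
        result = check_site(url, fetch)
        if not result.up:
            notify(result)
        checks += 1
        if max_checks is None or checks < max_checks:
            sleep(interval_seconds)
```

A call such as `monitor("https://decisions.example.com/", notify=print)` would probe a placeholder login URL every five minutes and print each failed check. A real tool would typically send an email, SMS, or webhook notification instead, and would run from a machine other than the server being monitored.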

Process Errors

Process errors occur in the Decisions workflow logic at runtime and are specific to the Decisions application itself. Some severe runtime errors in a process may be recorded as "System errors" by default, but they still need to be managed and resolved from within a Decisions environment.

Flows can be configured to catch unhandled exceptions using either the Catch Exception step or the On Exception outcome path of a step. For best practices on handling exceptions in the Decisions Studio, see the Exception Handling Best Practices article.

System Errors

System-level errors happen outside of the Decisions environment. They relate to a range of functional factors, potentially involving servers, host machines, or other internal or external services and applications that are tied to the Decisions environment (e.g., Service Host Manager). Another example is a Decisions Flow that is triggered by an API call: if the call is poorly formatted and never reaches Decisions, process error handling cannot capture this exception in any great detail.

A recommended way to manage these errors is with a Log Ingestion Tool. These software tools can read errors from Decisions at a more granular level than Decisions' built-in exception handling. They take in logs from Decisions and other sources, filter out unnecessary information, and combine the relevant pieces to provide detailed system-level information for an error event.

For these tools to be beneficial, they need to be configured to ingest more than just the Decisions logs. A Log Ingestion Tool could be configured to take in data from the Decisions Service Host Manager logs, IIS logs, LM logs, Event Viewer logs, SQL Server logs, etc. Many of these tools also generate reports from the information collected on error events.
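To make the "take in, filter, combine" idea concrete, here is a minimal Python sketch of correlating several log sources around an error event. It is not modeled on any particular Log Ingestion Tool. The line format `YYYY-MM-DD HH:MM:SS LEVEL message` is an assumption chosen for the example; real Decisions, IIS, Event Viewer, and SQL Server logs each use their own formats and would need their own parsers. All function and source names are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, List

# Assumed, simplified line format: "YYYY-MM-DD HH:MM:SS LEVEL message".
LINE_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.*)$")
KEEP_LEVELS = {"WARN", "ERROR"}  # assumption: everything else is noise


@dataclass
class LogEntry:
    timestamp: datetime
    source: str
    level: str
    message: str


def parse_source(source: str, lines: Iterable[str]) -> List[LogEntry]:
    """Parse one source's lines, dropping unparseable or low-severity lines."""
    entries = []
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # line does not fit the assumed format
        stamp, level, message = match.groups()
        if level.upper() not in KEEP_LEVELS:
            continue  # filter out unnecessary information
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        entries.append(LogEntry(when, source, level.upper(), message))
    return entries


def ingest(sources: Dict[str, Iterable[str]]) -> List[LogEntry]:
    """Merge all sources into one timeline ordered by timestamp."""
    merged: List[LogEntry] = []
    for name, lines in sources.items():
        merged.extend(parse_source(name, lines))
    return sorted(merged, key=lambda entry: entry.timestamp)


def error_context(
    entries: List[LogEntry], window: timedelta = timedelta(seconds=30)
) -> List[List[LogEntry]]:
    """For each ERROR entry, collect entries from any source within window."""
    events = []
    for entry in entries:
        if entry.level != "ERROR":
            continue
        nearby = [e for e in entries if abs(e.timestamp - entry.timestamp) <= window]
        events.append(nearby)
    return events
```

Given a dictionary such as `{"decisions": [...], "sql": [...]}`, `error_context(ingest(sources))` returns one group per error, each holding the warnings and errors from every source that occurred within 30 seconds of it. This is the kind of cross-source view that ingesting only the Decisions logs cannot provide.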

