About Clustering
  • 22 Sep 2021
  This documentation version is deprecated. Refer to the latest version of this article.

Overview

A Cluster is a configuration of one or more Decisions Application Servers running on the same database back-end. The servers work together to distribute work so that no single server is over- or under-utilized.

Important Note
The use of clustering in development environments is not supported by the platform.
Common Cluster Configuration 
The most common configuration places a Load Balancer in front of two Decisions servers; the Load Balancer distributes work between those servers.

High Availability

High Availability configurations require the removal of any single point of failure in the processing chain. This requires a Cluster of at least two Decisions Application Servers. It also requires that the customer configure and maintain a Load Balancer capable of routing traffic and running Health Checks on the Application Servers.
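
As a rough illustration of the Load Balancer's role described above, the following Python sketch rotates requests across Application Servers and skips any server that fails its Health Check. It is a toy model, not a Decisions or load-balancer API: the `Node` and `RoundRobinBalancer` names, the server names, and the `healthy` flag standing in for a real Health Check are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical stand-in for one Application Server."""
    name: str
    healthy: bool = True


class RoundRobinBalancer:
    """Toy load balancer: rotates over nodes that pass a health check."""

    def __init__(self, nodes: List[Node], health_check: Callable[[Node], bool]) -> None:
        self.nodes = nodes
        self.health_check = health_check
        self._next = 0  # index of the node to try first on the next request

    def pick(self) -> Node:
        # Try each node at most once, starting from the rotation pointer.
        for offset in range(len(self.nodes)):
            index = (self._next + offset) % len(self.nodes)
            node = self.nodes[index]
            if self.health_check(node):
                self._next = (index + 1) % len(self.nodes)
                return node
        raise RuntimeError("no healthy application server available")


nodes = [Node("app-1"), Node("app-2")]
balancer = RoundRobinBalancer(nodes, health_check=lambda n: n.healthy)

# While both servers are healthy, requests alternate between them.
first_two = [balancer.pick().name for _ in range(2)]

# Simulate an outage of app-1: every request now lands on app-2.
nodes[0].healthy = False
after_failure = [balancer.pick().name for _ in range(3)]
```

In a real deployment the Health Check would be a network probe run by the Load Balancer itself; the boolean flag here only makes the failover behaviour easy to see.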

Note on High Availability Environments 
For a truly resilient High Availability environment, it is also recommended to use a clustered SQL Server environment to prevent database failures from interrupting service.

High Availability is configured by establishing an Active-Active Cluster, preferably with geographic separation (or different regions or zones in Microsoft's or Amazon's cloud offerings). The Active-Active Cluster provides day-to-day processing benefits by dedicating more computing power to the system, but most importantly the two application servers provide redundancy.

Warning About Active-Passive Cluster
Though Decisions can also be configured in an Active-Passive Cluster (redundancy is present, but the servers do not work together to process everyday transactions), this is an uncommon scenario and is not generally recommended.

Transaction Data and Peer Communication

The Decisions platform relies on a set of services and capabilities that are mostly stateless, such as designing a Rule or a Workflow. The processes that customers produce may be very long-running and stateful, long-running and stateless, or short-running and stateless.

All Workflows and Rules have the ability to store data. This Stored Data is immediately written and not at risk during an outage. Uncommitted Data in a Flow or a Rule is the only data that is potentially at risk. 

Stateful Workflows can be resumed from their last state, making them more resilient when service is interrupted. Short-running Workflows execute in milliseconds, so the possibility of loss during an outage is minimal.

The servers in a Cluster communicate with one another to clear Cached Data. The servers do not send large complex messages to one another to maintain state. Instead, the servers let one another know when data has changed and should, therefore, be reloaded from the system of record. This makes Cluster communication very efficient.
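
The invalidation pattern described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the platform's implementation: `SystemOfRecord`, `ClusterNode`, the `rule:discount` key, and the in-process peer list (standing in for network messages between servers) are all hypothetical.

```python
from typing import Dict, List


class SystemOfRecord:
    """Stands in for the shared database back-end (hypothetical)."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}
        self.reads = 0  # counts how often a node had to reload from the database

    def read(self, key: str) -> str:
        self.reads += 1
        return self.rows[key]

    def write(self, key: str, value: str) -> None:
        self.rows[key] = value


class ClusterNode:
    """Toy server that caches reads and tells peers only *that* a key changed."""

    def __init__(self, name: str, db: SystemOfRecord) -> None:
        self.name = name
        self.db = db
        self.cache: Dict[str, str] = {}
        self.peers: List["ClusterNode"] = []

    def get(self, key: str) -> str:
        # Serve from cache when possible; otherwise reload from the database.
        if key not in self.cache:
            self.cache[key] = self.db.read(key)
        return self.cache[key]

    def update(self, key: str, value: str) -> None:
        # Write to the system of record, then send a small "key changed" notice.
        self.db.write(key, value)
        self.cache.pop(key, None)
        for peer in self.peers:
            peer.on_invalidate(key)

    def on_invalidate(self, key: str) -> None:
        # The message carries no data; the peer simply drops its cached copy.
        self.cache.pop(key, None)


db = SystemOfRecord()
db.write("rule:discount", "10%")
node_a = ClusterNode("app-1", db)
node_b = ClusterNode("app-2", db)
node_a.peers.append(node_b)
node_b.peers.append(node_a)

before = node_b.get("rule:discount")   # loaded from the database and cached
node_b.get("rule:discount")            # served from cache, no database read
reads_before_update = db.reads

node_a.update("rule:discount", "15%")  # app-1 changes the value and notifies app-2
after = node_b.get("rule:discount")    # app-2 reloads from the database
```

Because the notice contains only the key, the message stays small no matter how large the changed data is, which is the efficiency the paragraph above refers to.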

This approach keeps performance high, but it also has implications for how Clusters fail over. The easiest way to understand the implication is to imagine a Flow or Rule that is actively executing on a VM (or on a Container's CPU resources) and whose in-flight state exists only on that VM (or Container) at that moment. If the server experiences an abortive interruption, such as a power outage, that execution of the Flow or Rule will be lost. This is also true in a Clustered Environment, but any subsequent executions of the Rule and Flow engine will run on the server that is still operating, which minimizes disruption.

When a Flow or Rule's execution is critical, and even this very small chance of interruption is unacceptable, there is a pattern of "Leased Work" and "Work Queues" that can be used for reliable execution of a Flow/Rule and retry attempts.
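
To make the idea of leased work concrete, here is a minimal in-memory Python sketch of a queue in which a server takes a time-limited lease on a work item; if the server dies before completing it, the lease expires and another server can retry. This is a generic illustration of the pattern, not the Decisions "Leased Work" or "Work Queues" feature itself: the class names, the 30-second lease, the attempt limit, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class WorkItem:
    item_id: str
    payload: str
    leased_by: Optional[str] = None
    lease_expires_at: float = 0.0
    attempts: int = 0
    done: bool = False


class LeasedWorkQueue:
    """Toy work queue; a real one would keep its items in the shared database."""

    def __init__(self, lease_seconds: float, max_attempts: int = 3) -> None:
        self.lease_seconds = lease_seconds
        self.max_attempts = max_attempts
        self.items: Dict[str, WorkItem] = {}

    def add(self, item_id: str, payload: str) -> None:
        self.items[item_id] = WorkItem(item_id, payload)

    def lease(self, worker: str, now: float) -> Optional[WorkItem]:
        # Hand out the first unfinished item whose lease is free or has expired.
        for item in self.items.values():
            lease_free = item.leased_by is None or now >= item.lease_expires_at
            if not item.done and lease_free and item.attempts < self.max_attempts:
                item.leased_by = worker
                item.lease_expires_at = now + self.lease_seconds
                item.attempts += 1
                return item
        return None

    def complete(self, item_id: str, worker: str, now: float) -> bool:
        # Only the current lease holder may complete, and only before expiry.
        item = self.items[item_id]
        if item.leased_by == worker and now < item.lease_expires_at:
            item.done = True
            return True
        return False


queue = LeasedWorkQueue(lease_seconds=30)
queue.add("flow-1", "run critical flow")

first = queue.lease("server-a", now=0)      # server-a takes the lease, then crashes
blocked = queue.lease("server-b", now=10)   # lease still held: nothing to hand out
retry = queue.lease("server-b", now=31)     # lease expired: server-b retries the item
completed = queue.complete("flow-1", "server-b", now=35)
stale = queue.complete("flow-1", "server-a", now=36)  # old holder is rejected
```

The key design choice is that a crashed server never has to report its failure; the expiry time alone makes the work available again.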

Additional Information 
For further information on these capabilities, see:
Messaging Reliance 
It is also possible to rely on messaging using RabbitMQ, Kafka, Azure Service Bus, AWS, and other services that may fit a customer's architecture.

Multi-Tenant and Clusters

In a Multi-Tenant Decisions environment, there is no technical difference in the way that the pieces operate. There is one caveat: Decisions Multi-Tenant allows the Administrator to assign a Tenant Instance to one or more servers in a Cluster without assigning the Tenant to all Nodes.

Note on Configuration 
This configuration is uncommon and requires additional configuration on the Load Balancer.
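
One way to picture the additional Load Balancer configuration is as a routing rule that sends each Tenant's traffic only to the Nodes that host it. The Python sketch below is an assumption about what such a rule might look like, not documented Decisions or load-balancer behaviour; the node names, tenant names, and the `TENANT_NODES` mapping are hypothetical, and it assumes that a Tenant with no explicit assignment may run on every Node.

```python
from typing import Dict, List, Set

# Hypothetical cluster of three Nodes.
ALL_NODES: List[str] = ["app-1", "app-2", "app-3"]

# Hypothetical assignment: tenant-a is hosted on only two of the three Nodes.
TENANT_NODES: Dict[str, Set[str]] = {"tenant-a": {"app-1", "app-2"}}


def eligible_nodes(tenant: str, healthy: Set[str]) -> List[str]:
    """Return the Nodes that both host the Tenant and currently pass Health Checks."""
    # Assumption: a Tenant without an explicit assignment is hosted everywhere.
    assigned = TENANT_NODES.get(tenant, set(ALL_NODES))
    return [node for node in ALL_NODES if node in assigned and node in healthy]


all_healthy = {"app-1", "app-2", "app-3"}
restricted = eligible_nodes("tenant-a", all_healthy)
unrestricted = eligible_nodes("tenant-b", {"app-2", "app-3"})
no_capacity = eligible_nodes("tenant-a", {"app-3"})
```

The last call shows why this setup needs care: if every Node assigned to a Tenant is down, healthy Nodes elsewhere in the Cluster cannot take over for that Tenant.
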
Additional Information
For more information on configuration options, see Deployment and Configuration Options.
