Troubleshooting Network Problems and Clusters

In an unstable network environment, the email notifications that you configured as described in Cluster Email Notifications alert you to events affecting your cluster that might otherwise go unnoticed. The following table provides guidelines for interpreting the cluster email notifications that you may receive in such an environment:

Table 1. Troubleshooting network problems and clusters

Type of email: A single mail message stating that a standby node is no longer part of the cluster.

Possible interpretation:

  • A planned outage on a standby node has occurred.

  • Transient latency caused a standby node to enter discovery mode, and the node has since come back online as a standby node.

Type of email: A series of mail messages stating that a standby node is no longer part of the cluster.

Possible interpretation: Excessive latency is occurring repeatedly, but each episode is brief enough that no standby node has yet become active and caused a split brain.

Type of email: A single mail message stating that a failover has occurred.

Possible interpretation: A routine failover has occurred.

Type of email: Two mail messages in quick succession, one stating that a specific IP address is no longer part of the cluster, and the other stating that a failover has occurred and that the same IP address is the new active node.

Possible interpretation: A network outage or excessive latency has caused the cluster to enter a split brain state.

Type of email: A single mail message stating that the cluster no longer has multiple active nodes, received after the two mail messages described in the previous entry.

Possible interpretation: The self-healing mechanism has rejoined a cluster that entered a split brain state because of a network outage or excessive latency.

The table above is not exhaustive. For example, a network outage may also prevent a cluster node from contacting the mail server, or prevent the mail server from contacting you. Whenever you receive a cluster-related email alert, always investigate the health of your cluster.
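If you triage many notifications, the interpretations in the table can be sketched as a small rule set. The following Python example is only an illustration, not part of the product: the event labels (NODE_LEFT, FAILOVER, SINGLE_ACTIVE), the Notification record, the interpret function, and the 60-second threshold for "quick succession" are all assumptions introduced here. Mapping real notification emails onto these labels is left to you.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical labels for the message types in the table. The actual email
# wording is product-specific and is not parsed here.
NODE_LEFT = "node_left"                  # "... is no longer part of the cluster"
FAILOVER = "failover"                    # "a failover has occurred"
SINGLE_ACTIVE = "single_active_restored" # "no longer has multiple active nodes"


@dataclass
class Notification:
    timestamp: float           # seconds, any consistent clock
    kind: str                  # one of the labels above
    ip: Optional[str] = None   # node named in the message, if any


def interpret(events: List[Notification], window: float = 60.0) -> List[str]:
    """Return possible interpretations for a batch of notifications.

    `window` is an assumed threshold for "quick succession".
    """
    events = sorted(events, key=lambda e: e.timestamp)
    findings: List[str] = []
    paired = set()  # indexes of messages already explained as a split brain

    # A departure followed quickly by a failover naming the same IP address
    # suggests a split brain.
    for i, first in enumerate(events):
        if first.kind != NODE_LEFT or first.ip is None:
            continue
        for j in range(i + 1, len(events)):
            second = events[j]
            if second.timestamp - first.timestamp > window:
                break
            if second.kind == FAILOVER and second.ip == first.ip and j not in paired:
                paired.update((i, j))
                findings.append("possible split brain: network outage or excessive latency")
                break

    # Interpret whatever the split-brain rule did not consume.
    rest = [e for k, e in enumerate(events) if k not in paired]
    departures = sum(e.kind == NODE_LEFT for e in rest)
    if departures == 1:
        findings.append("planned outage or transient latency on a standby node")
    elif departures > 1:
        findings.append("repeated transient latency on a standby node")
    if any(e.kind == FAILOVER for e in rest):
        findings.append("routine failover")
    if any(e.kind == SINGLE_ACTIVE for e in rest):
        findings.append("self-healing mechanism has rejoined a split brain")
    return findings
```

For example, a NODE_LEFT message and a FAILOVER message for the same IP address five seconds apart, followed later by a SINGLE_ACTIVE message, yield a split brain finding followed by a self-healing finding. Like the table, the sketch is not exhaustive, and its output is a hint for investigation rather than a diagnosis.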