Before we look at MSCS clusters, we need to distinguish them from WebSphere MQ clusters.
In the rest of this book, "clusters" means WebSphere MQ clusters. In this chapter, "clusters" always means MSCS clusters.
Let us start by looking at a two-computer MSCS cluster. Such a cluster comprises two computers (for example, A and B) that are jointly connected to a network for client access using a virtual IP address. They might also be connected to each other by one or more private networks. A and B share at least one disk for the server applications on each to use. There is also another shared disk, which must be a redundant array of independent disks (RAID) Level 1, for the exclusive use of MSCS; this is known as the quorum disk. MSCS monitors both computers to check that the hardware and software are running correctly.
In a simple setup such as this, both computers have all the applications installed on them, but only computer A runs live applications; computer B is just running and waiting. If computer A encounters any one of a range of problems, MSCS shuts down the disrupted application in an orderly manner, transfers its state data to the other computer, and restarts the application there. This is known as a failover. Applications can be made cluster-aware so that they interact fully with MSCS and fail over gracefully.
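The failover sequence just described can be illustrated with a small toy model. This is only a sketch for intuition, not how MSCS is implemented: the Node class, the fail_over function, and the "queue-manager" application name are all hypothetical, and the shared state is reduced to a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One computer in a toy two-computer cluster (hypothetical model)."""
    name: str
    healthy: bool = True
    running: list = field(default_factory=list)  # applications live on this node


def fail_over(active: Node, standby: Node, state: dict) -> Node:
    """Move every application from a failed active node to the standby.

    Returns the node that is active afterwards.
    """
    if active.healthy:
        return active  # no problem detected, nothing to move
    if not standby.healthy:
        raise RuntimeError("no healthy node available")
    for app in list(active.running):
        active.running.remove(app)       # orderly shutdown on the failed node
        standby.running.append(app)      # restart on the standby
        state[app]["owner"] = standby.name  # state data now belongs to the standby
    return standby


# Computer A runs the live application; computer B is running and waiting.
a = Node("A", running=["queue-manager"])
b = Node("B")
state = {"queue-manager": {"owner": "A"}}

a.healthy = False  # simulate a problem on computer A
active = fail_over(a, b, state)
print(active.name, b.running, state)
```

After the simulated problem, the application is listed only on computer B and its state records B as the owner, which mirrors the shutdown, transfer, and restart steps in the text.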
A typical setup for a two-computer cluster is shown in Figure 28.
Figure 28. Two-computer MSCS cluster
Each computer can access the shared disk, but only one at a time, under the control of MSCS. In the event of a failover, MSCS switches the access to the other computer. The shared disk itself is usually a RAID, but need not be.
Each computer is connected to the external network for client access, and each has its own IP address. However, an external client communicating with this cluster sees only the one virtual IP address, and MSCS routes the IP traffic within the cluster appropriately.
MSCS also performs its own communications between the two computers, either over one or more private connections or over the public network, in order to monitor their states using the heartbeat, keep their databases in sync, and so on.
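The idea behind heartbeat monitoring can be sketched as a simple timeout check. The is_alive function and the 5-second timeout below are assumptions chosen for illustration; they do not reflect the actual MSCS heartbeat interval or algorithm.

```python
def is_alive(last_heartbeat: float, now: float, timeout: float = 5.0) -> bool:
    """Treat a peer as alive if its last heartbeat arrived within the timeout.

    The 5-second default is an illustrative assumption, not an MSCS setting.
    """
    return (now - last_heartbeat) <= timeout


# Times are plain seconds supplied by the caller, so the example is deterministic.
last_seen = {"A": 100.0, "B": 103.0}
now = 106.5

status = {name: is_alive(t, now) for name, t in last_seen.items()}
print(status)
```

In this sketch computer A, last heard from 6.5 seconds ago, is considered failed, while computer B, heard from 3.5 seconds ago, is considered alive. A real cluster service would typically require several missed heartbeats before starting a failover.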