Prerequisites
Packages
The following packages must be installed on both nodes. They are listed in order of their position in the "stack" of the database cluster, from the lowest layer to the highest:
- kmod-drbd83 This package provides the kernel-side portion of DRBD: essentially a new layer that inserts itself between the filesystem code and the block layer and duplicates all block I/O, sending a copy across the network to a remote DRBD instance.
- drbd83 Provides the administrative tools for managing the Distributed Replicated Block Device, interfacing with the kernel module. Version 8.3 adds new features compared to 8.2 and is considered stable enough for production use.
- cman The cluster management framework is provided by this package. Together with its dependencies, it handles inter-node communication and provides the idea of a "virtual" host on which services can run. It can be thought of as providing a Cluster Node. It performs health checks at the Host level.
- rgmanager If cman provides the Cluster Node, then rgmanager provides the Cluster Service. Specifically, it provides the Resource Manager, which implements the high-level abstractions used by application software. It performs health checks at a Service level.
More detailed descriptions of the packages that make up the cluster are included in the Cluster Manager section.
Note that the DRBD software is not part of stock Red Hat Enterprise Linux, but is provided instead by the CentOS Extras repository.
Note also that the Red Hat Cluster Suite is considered part of the Advanced Server system, and as such, will not be available with a stock RHEL subscription.
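A quick way to confirm the package list on each node is sketched below. It is only an illustration: the function name missing_packages is made up for this example, and it assumes an RPM-based system where `rpm -q` exits non-zero for a package that is not installed.

```shell
# Print the name of every package from the argument list that is NOT installed.
# missing_packages is a hypothetical helper name; package names come from the list above.
missing_packages() {
    for pkg in "$@"; do
        # rpm -q returns a non-zero status when the package is absent
        rpm -q "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
}

# Usage (run on both nodes; no output means nothing is missing):
#   missing_packages kmod-drbd83 drbd83 cman rgmanager
```

Running the same call on both nodes and comparing the output should show whether the two installations match.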
Block Devices
DRBD provides an abstraction of a block device to userspace. However, it is itself built on block devices on two different nodes, which it synchronizes across a network. As such, each replica node must have its own block device underneath to back the replicated blocks. DRBD then exports a real block device to the layers above, with the capacity of the backing device less the space used for administrative overhead (its "metadata").
While DRBD provides redundancy itself over the network, further resilience is added by backing it with block devices which are themselves replicated at the physical layer. We will use software RAID (Linux MD) devices in our environment. As such, a total of four physical disk partitions will be required for the DRBD volume (two on each node).
While DRBD can be configured to use only part of the block devices, administration will be simplified if the underlying devices are identical in size.
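To get a feel for how much space the metadata takes, the following sketch estimates its size from the size of the backing device. It assumes internal metadata and the sizing formula given in the DRBD User's Guide for the 8.x series, ceil(Cs / 2^18) * 8 + 72, with all sizes in 512-byte sectors. The function name drbd_meta_sectors and the device /dev/md0 are placeholders for this example.

```shell
# Estimate DRBD internal metadata size, in 512-byte sectors, for a backing
# device of $1 sectors. Assumed formula: ceil(Cs / 2^18) * 8 + 72.
# drbd_meta_sectors is a hypothetical helper name.
drbd_meta_sectors() {
    cs=$1
    # (cs + 262143) / 262144 is an integer ceiling of cs / 2^18
    echo $(( (cs + 262143) / 262144 * 8 + 72 ))
}

# Usage (blockdev --getsz reports the device size in 512-byte sectors;
# /dev/md0 is a placeholder for the local MD device):
#   drbd_meta_sectors "$(blockdev --getsz /dev/md0)"
```

For a 1 TiB backing device (2147483648 sectors) this gives 65608 sectors, or roughly 32 MiB. Comparing the `blockdev --getsz` output on both nodes is also a simple check that the underlying devices are the same size.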
NOTE: using LVM on the underlying device adds a capability not usually present in DRBD: the ability to snapshot the blocks of the backing store on the slave node. DRBD itself disallows all I/O on the slave node in the master-slave configuration in use at our site.
Network
Several components of our clustered database require network connectivity to communicate. The requirements are described here layer by layer.
Data Link Protocols
Multicast must be supported at the Data Link layer for Cluster internode communication (i.e., aisexec). Multicast is unusual in that it has both a Data Link Layer component and a Network Layer component: Ethernet frames which carry IP multicast traffic are sent to destination Station Addresses that share a reserved prefix (01:00:5e). This means that the switch which cluster nodes connect to has to recognize that the Ethernet frames are Multicast, and do something special with them.
Switches must either be configured to treat multicast Ethernet frames as broadcast (i.e., flood them out of all switch ports) or, as an optimization, to listen for IGMP joins and leaves from the IP nodes attached to them, forwarding Multicast frames only out of the switch ports required by the IGMP group membership of the attached stations. The latter is known as IGMP snooping. It is logically part of the Network Layer, but the switch can do it at layer 2 because IGMP packets carry a particular station address prefix.
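The relationship between an IP multicast group and the Ethernet address a switch sees can be sketched as follows. The mapping used here is the standard one for IPv4 (the fixed prefix 01:00:5e followed by the low 23 bits of the group address). The function name mcast_mac is made up for this example, and the sample address in the usage comment is not necessarily the one the cluster will use.

```shell
# Map an IPv4 multicast group address to the Ethernet destination address
# used on the wire: prefix 01:00:5e plus the low 23 bits of the IP address.
# mcast_mac is a hypothetical helper name.
mcast_mac() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into $1..$4
    IFS=$old_ifs
    # Only 224.0.0.0/4 (first octet 224-239) is multicast
    if [ "$1" -lt 224 ] || [ "$1" -gt 239 ]; then
        echo "not a multicast address" >&2
        return 1
    fi
    # The top bit of the second octet is dropped (23 bits are kept)
    printf '01:00:5e:%02x:%02x:%02x\n' $(( $2 & 127 )) "$3" "$4"
}

# Usage:
#   mcast_mac 239.192.0.1
```

A frame capture on the switch port (or on the node) showing destination addresses with this prefix is a sign that the cluster traffic really is going out as multicast.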
Network Protocols
IP Multicast must be enabled in the kernel; this is the default RHEL configuration. Also note that iptables must not be configured to drop multicast packets (i.e. the CIDR block 224.0.0.0/4), which many default iptables configurations will do for safety.
It is not necessary to use IGMP if all cluster nodes are on the same segment. If routing boundaries must be crossed, IGMP must be configured on the nodes and the routers which join them, or static multicast routes will need to be configured on the routing path between cluster nodes.
Since the nodes we will consider for this exercise will be on the same segment, this will not be necessary for us.
Transport Protocols
| Component | Ports |
| drbd | 7789/tcp |
| cman | 5404/udp, 5405/udp |
| rgmanager | 41966/udp, 41967/udp, 41968/udp, 41969/udp |
| ccsd | 50006/tcp, 5000 |
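If iptables is active on the nodes, these ports have to be allowed through. The sketch below only prints candidate rules so they can be reviewed before being applied. The function name print_accept_rules is made up for this example, the INPUT chain and the unrestricted source are assumptions about the local firewall layout, and the incomplete ccsd entry from the table is deliberately left out of the usage example.

```shell
# Print (but do not apply) one iptables ACCEPT rule per "port/proto" argument.
# print_accept_rules is a hypothetical helper name; the INPUT chain is assumed.
print_accept_rules() {
    for spec in "$@"; do
        port=${spec%/*}     # text before the slash, e.g. 7789
        proto=${spec#*/}    # text after the slash, e.g. tcp
        echo "iptables -A INPUT -p $proto --dport $port -j ACCEPT"
    done
}

# Usage, with the complete entries from the table:
#   print_accept_rules 7789/tcp 5404/udp 5405/udp \
#       41966/udp 41967/udp 41968/udp 41969/udp 50006/tcp
```

In practice the rules would probably also be restricted to the private interface or to the peer's address; that is left out here to keep the example short.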
Other Requirements
Finally, both nodes must know each other by name, and the name must correspond to the private link between them, so the components will communicate over it correctly. In particular, the following commands should yield identical results on all nodes:
$ getent hosts node1
$ getent hosts node2
$ getent hosts serviceip
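The comparison across nodes can be made less error-prone with a small helper that reduces the lookups to one line per name. This is a sketch only: the function name resolution_fingerprint is made up for this example, and node1, node2 and serviceip stand for whatever names are actually in use.

```shell
# Print "name=address" for each name as this node resolves it, or
# "name=UNRESOLVED" when the lookup fails.
# resolution_fingerprint is a hypothetical helper name.
resolution_fingerprint() {
    for name in "$@"; do
        # getent prints "address  name [aliases]"; keep the first address only
        addr=$(getent hosts "$name" 2>/dev/null | awk 'NR==1 {print $1}') || addr=""
        printf '%s=%s\n' "$name" "${addr:-UNRESOLVED}"
    done
}

# Usage (run on every node and compare the output):
#   resolution_fingerprint node1 node2 serviceip
```

The output should be byte-for-byte identical on all nodes, and the addresses shown should belong to the private link.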
Note that other components of RHCS -- which we don't use in this document -- have additional requirements:
- GUI cluster admin (Conga, redhat-config-cluster)
- dlm (for GFS)
- gnbd (export a block device via network)
Once all these prerequisites are in place, the implementation can proceed.