Linux heartbeat installation


Requirements

Network

A separate NIC should be installed for heartbeat and other private services.

Preparation

Heartbeat is part of the EPEL repo. To install the package you will need to install the epel-release package first.

 rpm -ivh http://mirrors.liquidweb.com/fedora-epel/6/x86_64/epel-release-6-8.noarch.rpm

Firewall

Each node in the HA cluster must be configured to accept connections from its peers on the UDP port defined in /etc/ha.d/ha.cf (default 694).
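
As a rough sketch of the rule described above, the following shell function reads the udpport value from ha.cf (falling back to 694 when the file or directive is missing) and prints a matching iptables command instead of applying it. The function name heartbeat_fw_rule, the eth1 default, and the choice of iptables as the firewall are assumptions made for illustration.

```shell
# Hypothetical helper: print (do not apply) an iptables rule that accepts
# heartbeat traffic on the UDP port configured in ha.cf.
# Usage: heartbeat_fw_rule [path-to-ha.cf] [interface]
heartbeat_fw_rule() {
    hb_conf="${1:-/etc/ha.d/ha.cf}"
    hb_iface="${2:-eth1}"   # assumed private heartbeat NIC
    # Take the first udpport directive; empty if the file or directive is missing.
    hb_port=$(awk '$1 == "udpport" { print $2; exit }' "$hb_conf" 2>/dev/null) || hb_port=""
    echo "iptables -I INPUT -i $hb_iface -p udp --dport ${hb_port:-694} -j ACCEPT"
}
```

Review the printed command before running it as root on each node. Restricting the rule to the private interface keeps the heartbeat port closed on public-facing NICs.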

Configuration File Permissions

For security, and because heartbeat will not start if the permissions on its configuration files are too lax, set the following files to mode 600:

  • /etc/ha.d/authkeys
  • /etc/ha.d/ha.cf
  • /etc/ha.d/haresources

 touch /etc/ha.d/{authkeys,ha.cf,haresources}
 chmod 600 /etc/ha.d/{authkeys,ha.cf,haresources}
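
To confirm the permissions afterwards, a small check along these lines may help. It is only a sketch: the function name check_ha_perms is made up for this page, and it relies on GNU `stat -c`, which is available on Linux but not on every Unix.

```shell
# Hypothetical helper: report heartbeat config files whose mode is not 600.
# Returns 0 if all three files exist with mode 600, 1 otherwise.
# Usage: check_ha_perms [config-dir]
check_ha_perms() {
    pm_dir="${1:-/etc/ha.d}"
    pm_bad=0
    for pm_f in authkeys ha.cf haresources; do
        # GNU stat prints the octal mode; a missing file is reported as such.
        pm_mode=$(stat -c '%a' "$pm_dir/$pm_f" 2>/dev/null) || pm_mode="missing"
        if [ "$pm_mode" != "600" ]; then
            echo "$pm_dir/$pm_f: mode $pm_mode (expected 600)"
            pm_bad=1
        fi
    done
    return "$pm_bad"
}
```

Running `check_ha_perms` with no arguments inspects /etc/ha.d and prints nothing when all three files are correct.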

STONITH

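Even with heartbeat in place and functioning properly, an HA setup is not truly HA unless there is a way to guarantee that resources transfer from a failed node to a good node. A STONITH ("Shoot The Other Node In The Head") device should be used to avoid a split-brain scenario, in which both nodes believe the other has failed and both try to run the same resources.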

Configuration

Configuration files must be identical across the entire cluster. Use a tool such as git, puppet, or rsync to keep them in sync.
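
One way to verify this is to compare the local files against a copy of the peer's files. The sketch below assumes the peer's /etc/ha.d has already been copied into a scratch directory; the function name ha_conf_compare is hypothetical.

```shell
# Hypothetical helper: report heartbeat config files that differ between two
# directories, e.g. the local /etc/ha.d and a copy fetched from the peer.
# Returns 0 only if all three files are byte-for-byte identical.
# Usage: ha_conf_compare DIR_A DIR_B
ha_conf_compare() {
    cmp_a="$1"
    cmp_b="$2"
    cmp_rc=0
    for cmp_f in authkeys ha.cf haresources; do
        # cmp -s is silent and exits non-zero on any difference or missing file.
        if ! cmp -s "$cmp_a/$cmp_f" "$cmp_b/$cmp_f" 2>/dev/null; then
            echo "DIFFERS: $cmp_f"
            cmp_rc=1
        fi
    done
    return "$cmp_rc"
}
```

For example, after something like `rsync -a root@lclient2.watters.ws:/etc/ha.d/ /tmp/peer-ha.d/` (the scratch path is an assumption), `ha_conf_compare /etc/ha.d /tmp/peer-ha.d` lists any file that has drifted.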

/etc/ha.d/authkeys

The authkeys file defines the authentication method and shared key that allow the two HA nodes to communicate with each other. Generate a random string to use as the key, for example with pwgen:

pwgen 32 1

The file should look like this:

auth 1
1 sha1 <password>
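
If pwgen is not installed, the file can be produced in one step along the following lines. This is a sketch under assumptions: it swaps pwgen for /dev/urandom, uses a 32-character hex key, and the function name write_authkeys is made up for this page.

```shell
# Hypothetical helper: write an authkeys file with a random sha1 key.
# Uses /dev/urandom instead of pwgen so that no extra package is needed.
# Usage: write_authkeys [output-path]
write_authkeys() {
    ak_out="${1:-/etc/ha.d/authkeys}"
    # 16 random bytes rendered as 32 hex characters.
    ak_key=$(dd if=/dev/urandom bs=16 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    # Create the file with restrictive permissions from the start.
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$ak_key" > "$ak_out" ) || return 1
    # Also tighten the mode in case the file already existed.
    chmod 600 "$ak_out"
}
```

Generate the file on one node only and copy it to the other, since the key must be the same on both.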

/etc/ha.d/ha.cf

The ha.cf file is the main configuration file for heartbeat. It defines the manner in which the two nodes in the HA cluster communicate with each other in order to determine if resource takeover is necessary. Example config:

# Logging
logfacility local0
debugfile /var/log/ha-debug
logfile /var/log/ha-log
   
# Network
udpport 694
bcast eth1
auto_failback off 
   
# Node Info
node lclient1.watters.ws
node lclient2.watters.ws
   
# Timeout Values
keepalive 2   # seconds between heartbeats
deadtime 30   # seconds to wait before declaring a node dead
initdead 40   # seconds before heartbeat starts the resources the first time it starts

Note: The node entries must match the name returned by `uname -n` or heartbeat will refuse to start.
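
The check in the note above can be scripted before starting heartbeat. The following is a minimal sketch; the function name node_in_hacf is hypothetical, and it assumes node names are listed after the node keyword, one or more per line.

```shell
# Hypothetical helper: succeed if the given name (default: `uname -n`)
# appears in a node directive of ha.cf.
# Usage: node_in_hacf [path-to-ha.cf] [node-name]
node_in_hacf() {
    nd_conf="${1:-/etc/ha.d/ha.cf}"
    nd_name="${2:-$(uname -n)}"
    awk -v want="$nd_name" '
        # Compare every name listed after the "node" keyword.
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$nd_conf"
}
```

For example, `node_in_hacf || echo "$(uname -n) is not listed in ha.cf"` prints a warning on a node whose name does not match.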

/etc/ha.d/haresources

The haresources file can manage any service that has an LSB-compliant init script. Heartbeat also ships resource scripts such as IPaddr for managing floating (service) IPs. The example further down uses these placeholders:

  • HOSTNAME1 - The hostname of the node designated as the "main" node, which normally holds the resources
  • PUBLICHAIP - The public IP address that floats between the two servers, controlled solely by heartbeat

An important caveat is that the haresources file is treated as a single-line string: everything must be on one line.

HOSTNAME1 IPaddr::PUBLICHAIP

A good rule of thumb is that the IP addresses should be the last item to appear in the haresources file. This way services will be started before the box starts getting traffic.

Here are some example resources that can be used.

  • DRBD
drbddisk::DRBDRESOURCENAME
  • LVM
LVM::LOGICALVOLUMENAME
  • File Systems
Filesystem::/path/to/device::/path/to/mount/point::fstype
  • Almost any init script in /etc/init.d, referenced by its file name
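
Putting the pieces above together, a complete entry can be assembled with a small helper that always places the floating IP last. This is an illustrative sketch: the function name haresources_line is hypothetical, and the resource names, device paths, and address in the usage example are placeholders rather than values from a real cluster.

```shell
# Hypothetical helper: print a one-line haresources entry with the floating
# IP placed last, so services start before the address comes up.
# Usage: haresources_line NODE FLOATING_IP [RESOURCE...]
haresources_line() {
    hr_line="$1"
    hr_ip="$2"
    shift 2
    # Append the remaining resources in the order given.
    for hr_res in "$@"; do
        hr_line="$hr_line $hr_res"
    done
    echo "$hr_line IPaddr::$hr_ip"
}

# Example with placeholder values (r0, /dev/drbd0, /data, ext4, httpd, 192.0.2.10):
haresources_line lclient1.watters.ws 192.0.2.10 \
    drbddisk::r0 Filesystem::/dev/drbd0::/data::ext4 httpd
```

The output is a single line that can be reviewed and then written to /etc/ha.d/haresources on every node.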