HeartBeat

From LQWiki

HeartBeat is high-availability cluster software for Linux that lets you make a service highly available without needing shared storage. The concept is that one cluster node offers the service on a virtual IP address. When that node fails, another cluster node takes over the virtual IP address and continues serving.
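As a rough illustration of the virtual IP concept, the sketch below checks whether a given address is currently configured on the local node. The function name holds_ip is hypothetical, and the sketch assumes the iproute2 `ip` tool is available and that its one-line output places the address right after the `inet` keyword.

```shell
#!/bin/sh
# holds_ip ADDRESS  (hypothetical helper, not part of heartbeat)
# Reads `ip -o -4 addr show` output on stdin and succeeds (exit 0) if
# ADDRESS is configured on any interface, ignoring the /prefix length.
holds_ip() {
    awk -v want="$1" '
        {
            # In one-line mode the "address/prefix" field follows "inet".
            for (i = 1; i < NF; i++) {
                if ($i == "inet") {
                    split($(i + 1), a, "/")
                    if (a[1] == want) found = 1
                }
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on a cluster node (192.168.1.1 is the example virtual IP on this page):
#   ip -o -4 addr show | holds_ip 192.168.1.1 && echo "this node is serving"
```

Run on each node, this should succeed on exactly one of them at any time: the node that currently offers the service.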

Installation

Architecture

Role                 Name         IP            Virtual/physical
first cluster node   heartbeat-1  192.168.0.21  virtual
second cluster node  heartbeat-2  192.168.0.22  virtual
iSCSI storage        tweedleburg  192.168.0.5   physical
virtual IP address   gothere      192.168.1.1   virtual

Process

  • On heartbeat-*, install SLES 10 with the default settings, but disable ZMD and the firewall.
  • On tweedleburg, set up an iSCSI target to serve as storage.
  • establish passwordless SSH login between heartbeat-1 and heartbeat-2.
  • adapt /etc/hosts on both nodes so that it contains:
192.168.0.21    heartbeat-1.site heartbeat-1
192.168.0.22    heartbeat-2.site heartbeat-2
  • configure the cluster on node 1
heartbeat-1:~ # yast2 heartbeat

Choose MD5 as the authentication method and myauth as the password.

  • install heartbeat on node 2
heartbeat-2:~ # yast -i heartbeat
  • propagate the heartbeat configuration
heartbeat-1:~ # scp /etc/ha.d/ha.cf root@heartbeat-2:/etc/ha.d
heartbeat-1:~ # scp /etc/ha.d/authkeys root@heartbeat-2:/etc/ha.d
  • start the cluster
heartbeat-1:~ # /etc/init.d/heartbeat start
heartbeat-2:~ # /etc/init.d/heartbeat start
  • set a password for the hacluster admin
heartbeat-1:~ # passwd hacluster
  • configure the cluster
heartbeat-1:~ # hb_gui

Log in to heartbeat-1 as the hacluster user.
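The /etc/hosts step in the list above can be scripted so that it is safe to repeat on both nodes. The following is a minimal sketch, not part of heartbeat; the function name ensure_host_entry is hypothetical, and it treats a line as present when its first field equals the IP address.

```shell
#!/bin/sh
# ensure_host_entry FILE IP NAME...  (hypothetical helper)
# Appends "IP<TAB>NAME..." to FILE unless FILE already has a line
# whose first field is IP. Running it twice changes nothing.
ensure_host_entry() {
    file=$1
    addr=$2
    shift 2
    if [ -f "$file" ] &&
       awk -v addr="$addr" '$1 == addr { found = 1 } END { exit (found ? 0 : 1) }' "$file"
    then
        return 0    # entry already present
    fi
    printf '%s\t%s\n' "$addr" "$*" >> "$file"
}

# Usage as root on each node:
#   ensure_host_entry /etc/hosts 192.168.0.21 heartbeat-1.site heartbeat-1
#   ensure_host_entry /etc/hosts 192.168.0.22 heartbeat-2.site heartbeat-2
```

Matching on the IP field rather than the whole line avoids duplicate entries if the host names were written with different spacing.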

Shared storage

  • build sfex from http://www.linux-ha.org/sfex on both nodes
  • set up an iSCSI initiator on both nodes; the examples below assume the iSCSI disk appears as /dev/sdb.
  • test sfex by initializing and then locking /dev/sdb
heartbeat-1:~ # /usr/lib64/heartbeat/sfex_init /dev/sdb
heartbeat-1:~ # /usr/lib64/heartbeat/sfex_lock /dev/sdb
acquired lock successfully.
heartbeat-1:~ # /usr/lib64/heartbeat/sfex_stat /dev/sdb
control data:
  magic: 0x01, 0x1f, 0x71, 0x7f
  version: 1
  revision: 3
  blocksize: 512
  numlocks: 1
lock data #1:
  status: lock
  count: 2
  nodename: heartbeat-1
status is LOCKED.
  • verify that the other cluster node sees the lock; the lock data should name heartbeat-1 as the holder:
heartbeat-2:~/sfex-1.3 # /usr/lib64/heartbeat/sfex_stat /dev/sdb
control data:
  magic: 0x01, 0x1f, 0x71, 0x7f
  version: 1
  revision: 3
  blocksize: 512
  numlocks: 1
lock data #1:
  status: lock
  count: 2
  nodename: heartbeat-1
status is UNLOCKED.
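The check above can be scripted by reading the lock data section of the sfex_stat output. This is a minimal sketch based only on the output format shown on this page; the function name sfex_lock_holder is hypothetical, and the assumption that an unheld lock reports a status other than "lock" is not confirmed here.

```shell
#!/bin/sh
# sfex_lock_holder  (hypothetical helper)
# Reads sfex_stat output on stdin and prints the node named in the
# "lock data" section, but only if that section reports "status: lock".
# Prints nothing if the lock is not reported as held (assumption: any
# other status value means the lock is free).
sfex_lock_holder() {
    awk '
        $1 == "status:"   { locked = ($2 == "lock") }
        $1 == "nodename:" { if (locked) print $2; exit }
    '
}

# Usage on either node:
#   /usr/lib64/heartbeat/sfex_stat /dev/sdb | sfex_lock_holder
```

Both nodes should print the same node name while the lock is held, even though the final status line differs between them in the transcripts above.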

Configure cluster resources

In hb_gui, add a new resource group named resource_group. Add the sfex device as its first resource, and add the start, stop and monitor operations to that resource.

Commands

To check whether your cluster is up and running, use the cluster resource manager's monitor, crm_mon. For example, this refreshes the status display every 5 seconds:

crm_mon -i5
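For use in scripts, where an interactive display is not practical, a polling helper may be more convenient. The sketch below is a generic retry loop; the function name wait_for is hypothetical, and the usage line assumes that your crm_mon version supports a one-shot flag (-1) and prints node names in its output.

```shell
#!/bin/sh
# wait_for TIMEOUT INTERVAL COMMAND...  (hypothetical helper)
# Re-runs COMMAND every INTERVAL seconds until it succeeds or until
# TIMEOUT seconds of waiting have passed. Returns 0 on success and
# 1 on timeout.
wait_for() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@"; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Usage (assumption: crm_mon -1 prints the status once and exits):
#   wait_for 60 5 sh -c 'crm_mon -1 | grep -q heartbeat-2' \
#       && echo "heartbeat-2 appears in the cluster status"
```

Seeing a node name in the output only shows that the cluster knows about the node, so check the reported state as well before relying on it.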