CentOS Cluster-DRBD Setup

As part of a MySQL Cluster setup, I recently set up a 2-node web cluster using CentOS's native cluster software suite, with a twist: the web root was mounted on a DRBD partition in place of periodic file syncing. This article focuses on the cluster setup and does not cover the DRBD setup/configuration. Let's go:

Install “Cluster Storage” group using yum:

[root@host]# yum groupinstall "Cluster Storage"

Edit /etc/cluster/cluster.conf:
======================================================
<?xml version="1.0"?>
<cluster name="drbd_srv" config_version="1">
<cman two_node="1" expected_votes="1">
</cman>
<clusternodes>
<clusternode name="WEB_Node1" votes="1" nodeid="1">
<fence>
<method name="single">
<device name="human" ipaddr="10.255.255.225"/>
</method>
</fence>
</clusternode>
<clusternode name="WEB_Node2" votes="1" nodeid="2">
<fence>
<method name="single">
<device name="human" ipaddr="10.255.255.226"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="human" agent="fence_manual"/>
</fencedevices>
</cluster>
======================================================
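cluster.conf must be identical on both nodes. A minimal sketch for pushing it over and confirming the copies match (the helper name is mine; WEB_Node2 is the node name used above, so adjust for your hosts):

```shell
# sync_cluster_conf copies cluster.conf to the remote node given as $1
# and prints both checksums, which should be identical.
sync_cluster_conf() {
  scp /etc/cluster/cluster.conf "$1":/etc/cluster/cluster.conf &&
  md5sum /etc/cluster/cluster.conf &&
  ssh "$1" md5sum /etc/cluster/cluster.conf
}
# Usage: sync_cluster_conf WEB_Node2
```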

Start the cluster:

[root@host]# service cman start (on both nodes)

To verify proper startup:

[root@host]# cman_tool nodes

Should show:

Node  Sts   Inc   Joined               Name
1   M     16   2009-08-11 22:13:27  WEB_Node1
2   M     24   2009-08-11 22:13:34  WEB_Node2

A status of 'M' means the node is a member and healthy; 'X' would mean there is a problem.
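If you want to script that same check (for a monitoring cron job, say), here is a small sketch; the helper name and the column layout assumption are mine, not part of the cluster suite:

```shell
# check_nodes reads `cman_tool nodes` output on stdin and prints any node
# whose Sts column is not 'M'; it exits non-zero if a problem was found.
check_nodes() {
  awk 'NR > 1 && $2 != "M" { print "problem node: " $NF; bad = 1 }
       END { exit bad }'
}
# Usage: cman_tool nodes | check_nodes || echo "cluster degraded"
```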

Edit /etc/lvm/lvm.conf

change:

locking_type = 1

to:

locking_type = 3

and change:

filter = [ "a/.*/" ]

to:

filter = [ "a|drbd.*|", "r|.*|" ]
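The two edits can also be applied with sed. This is only a sketch, assuming the stock CentOS lvm.conf layout (uncommented settings at the start of the line) — back up the file first:

```shell
# patch_lvm_conf edits the lvm.conf given as $1 in place:
# locking_type 1 -> 3, and the default filter -> the drbd-only filter.
patch_lvm_conf() {
  sed -i -e 's/^\( *locking_type\) *= *1/\1 = 3/' \
         -e 's#^\( *filter\) *= *.*#\1 = [ "a|drbd.*|", "r|.*|" ]#' "$1"
}
# Usage: cp /etc/lvm/lvm.conf /etc/lvm/lvm.conf.bak && patch_lvm_conf /etc/lvm/lvm.conf
```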

Start clvmd:

[root@host]# service clvmd start (on both nodes)

Set cman and clvmd to start on bootup on both nodes:

[root@host]# chkconfig --level 345 cman on
[root@host]# chkconfig --level 345 clvmd on

Run vgscan:

[root@host]# vgscan

Create a new PV (physical volume) using the drbd block device

[root@host]# pvcreate /dev/drbd1

Create a new VG (volume group) using the drbd block device

[root@host]# vgcreate VolGroup01 /dev/drbd1 (name it whatever you like; the examples below use VolGroup01)

Now when you run vgdisplay, you should see:

--- Volume group ---
VG Name             VolGroup01
System ID
Format                lvm2
Metadata Areas        1
Metadata Sequence No  1
VG Access             read/write
VG Status             resizable
Clustered             yes
Shared                no
MAX LV                0
Cur LV                0
Open LV               0
Max PV                0
Cur PV                1
Act PV                1
VG Size               114.81 GB
PE Size               4.00 MB
Total PE              29391
Alloc PE / Size       0 / 0
Free  PE / Size       29391 / 114.81 GB
VG UUID               k9TBBF-xdg7-as4a-2F0c-XGTv-M2Wh-CabVXZ

Notice the line that reads "Clustered yes".
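To check that same line from a script (the helper name is mine):

```shell
# is_clustered reads `vgdisplay` output on stdin and succeeds only if the
# volume group reports "Clustered yes".
is_clustered() {
  grep -q 'Clustered  *yes'
}
# Usage: vgdisplay VolGroup01 | is_clustered || echo "VG is not clustered!"
```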

Create an LV (logical volume) in the VG that we just created.
**Make sure both of your drbd nodes are in the primary role before doing this.

If there is a “Split-Brain detected, dropping connection!” entry in /var/log/messages, then a manual split-brain recovery is necessary.

To manually recover from a split-brain scenario (on the split-brain machine):

[root@host]# drbdadm secondary r0
[root@host]# drbdadm -- --discard-my-data connect r0
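Those two steps can be wrapped in a small helper (my own sketch, not part of drbd; setting RUN=echo gives a dry run that only prints the commands):

```shell
# split_brain_recover demotes the split-brained node, discards its local
# changes, and reconnects the drbd resource given as $1.
# Set RUN=echo for a dry run that prints the commands instead.
split_brain_recover() {
  ${RUN:-} drbdadm secondary "$1"
  ${RUN:-} drbdadm -- --discard-my-data connect "$1"
}
# Usage (on the split-brained node): split_brain_recover r0
```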

Once you have verified that both nodes are in primary mode:

On node2:

[root@host]# service clvmd restart

On node1:

[root@host]# lvcreate -l 100%FREE -n gfs VolGroup01

Format the LV with gfs:

[root@host]# mkfs.gfs -p lock_dlm -t drbd_srv:www -j 2 /dev/VolGroup01/gfs

(drbd_srv is the cluster name, www is the locking table name, and -j 2 creates one journal per node)

Start the gfs service (on both nodes):

[root@host]# service gfs start

Set the gfs service to start automatically (on both nodes):

[root@host]# chkconfig --level 345 gfs on

Mount the filesystem (on both nodes):

[root@host]# mount -t gfs /dev/VolGroup01/gfs /srv
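If you would rather persist the mount in /etc/fstab (the gfs init script mounts fstab entries of type gfs at startup), here is a sketch; the helper takes the fstab path as an argument so you can try it against a scratch file first:

```shell
# add_gfs_fstab_entry appends the gfs mount to the fstab given as $1,
# but only if it is not already present.
add_gfs_fstab_entry() {
  entry='/dev/VolGroup01/gfs  /srv  gfs  defaults  0 0'
  grep -qF "$entry" "$1" || echo "$entry" >> "$1"
}
# Usage: add_gfs_fstab_entry /etc/fstab
```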

Modify the startup sequence as follows:

1) network
2) drbd  (S15)
3) cman  (S21)
4) clvmd (S24)
5) gfs   (S26)

Remove the soft link that shuts down openais (K20openais in runlevel 3 for our install), since shutting down cman does the same thing.

Modify the shutdown sequence as follows (the reverse order of the startup sequence):

1) gfs   (K21)
2) clvmd (K22)
3) cman  (K23)
4) drbd  (K24)
5) network
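One way to put those priorities in place is to recreate the rc symlinks directly. A sketch (the helper name is mine; it takes the rc directory as an argument, and you would repeat it for runlevels 3, 4, and 5 on both nodes — the network links keep their default priorities, which already bracket these):

```shell
# order_links recreates the start (S) and kill (K) links in the rc
# directory given as $1, using the priorities listed above, and removes
# the redundant openais kill link.
order_links() {
  ln -sf ../init.d/drbd  "$1/S15drbd";  ln -sf ../init.d/cman  "$1/S21cman"
  ln -sf ../init.d/clvmd "$1/S24clvmd"; ln -sf ../init.d/gfs   "$1/S26gfs"
  ln -sf ../init.d/gfs   "$1/K21gfs";   ln -sf ../init.d/clvmd "$1/K22clvmd"
  ln -sf ../init.d/cman  "$1/K23cman";  ln -sf ../init.d/drbd  "$1/K24drbd"
  rm -f "$1/K20openais"
}
# Usage: for rl in 3 4 5; do order_links /etc/rc$rl.d; done
```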

Reboot each server to test availability of gfs filesystem during reboot.

Troubleshooting:

If one node's drbd status (cat /proc/drbd) shows:

version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-10-03 11:30:17

1: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown   r---
ns:0 nr:0 dw:40 dr:845 al:1 bm:3 lo:0 pe:0 ua:0 ap:0 oos:12288

and the other node shows:

version: 8.2.6 (api:88/proto:86-88)
GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by buildsvn@c5-x8664-build, 2008-10-03 11:30:17

1: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---
ns:0 nr:0 dw:56 dr:645 al:3 bm:2 lo:0 pe:0 ua:0 ap:0 oos:8200

(the node with cs:WFConnection in the status field should also still have the gfs filesystem mounted), then there is a drbd sync issue. To solve this, restart the cluster services on the node with cs:StandAlone in the status field:

1) service clvmd stop
2) service cman stop
3) service drbd restart

Verify that you see Primary/Primary in the st: section of the drbd status (cat /proc/drbd), then:

4) service cman start
5) service clvmd start
6) service gfs start