Contents
1 Introduction
2 System Configuration
3 Kernel Configuration
4.1 DRBD
4.2 OCFS2
5 Configuration
5.1 DRBD
5.2 OCFS2
7 References
8 Resources
Introduction
This is a rewrite of the original article by Mathijs, later edited by Mars105; most of the credit goes to them.
The main changes are the reappearance of the ocfs2 ebuild in the main Portage tree and the new built-in DRBD
kernel module (>=2.6.33).
You may want to leave "OCFS2 expensive checks" unchecked. For the other items, read the help.
Alternatively, you can compile these options as modules and then add the following modules to
/etc/conf.d/modules:
ocfs2
ocfs2_dlmfs
configfs
ocfs2_dlm
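As a minimal sketch, the module list above can be turned into a single line for /etc/conf.d/modules. The `modules="..."` variable syntax is an assumption about the OpenRC version in use, and `modules_line` is a hypothetical helper name; check the comments in your own /etc/conf.d/modules before appending anything.

```shell
#!/bin/sh
# Build a modules="..." line from a list of module names.
# Assumption: OpenRC reads a space-separated list from the "modules" variable.
modules_line() {
    printf 'modules="%s"\n' "$*"
}

# Print the line for review; append it to /etc/conf.d/modules once it looks right.
modules_line ocfs2 ocfs2_dlmfs configfs ocfs2_dlm
```

The helper only prints the line, so nothing on the system changes until you copy the result into the file yourself.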
You may need to unmask the ebuild (replace amd64 with your platform):
echo "=sys-cluster/drbd-<your module version>-* ~amd64" >> /etc/portage/package.keywords
OCFS2
Unmasking the needed ebuilds. (Replace amd64 with your platform)
echo "sys-fs/ocfs2-tools ~amd64" >> /etc/portage/package.keywords
Emerging ocfs2-tools
Configuration
DRBD
Now we tell DRBD how to operate. We define two nodes (Tweedledum and Tweedledee), each with a previously
created backing disk at /dev/vg1/xen. The DRBD device is /dev/drbd1 on both nodes, and the resource name is
drbd_xen.
File: /etc/drbd.d/global_common.conf
# Common settings, apply to all resources
common
{
    startup
    {
        wfc-timeout 60;       # Wait for connection timeout
        degr-wfc-timeout 60;  # Wait for connection timeout, if this node was a degraded cluster
    }
    disk
    {
        on-io-error detach;   # Drop the disk on I/O error
    }
    syncer
    {
        rate 10M;             # Limit sync speed to 10 MByte/s for FastEthernet
    }
    protocol C;
}
File: /etc/drbd.conf
# drbd_xen: our DRBD space
resource drbd_xen
{
    on Tweedledum
    {
        device    /dev/drbd1;
        disk      /dev/vg1/xen;
        address   192.168.0.31:7789;
        meta-disk internal;
    }
    on Tweedledee
    {
        device    /dev/drbd1;
        disk      /dev/vg1/xen;
        address   192.168.0.32:7789;
        meta-disk internal;
    }
}
Warning: rate 10M means 10 MByte/s, which is 80 Mbit/s. On a 100 Mbit/s FastEthernet port, synchronization alone will use most of the link.
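The arithmetic behind this warning can be checked with a short sketch. The function names `mbyte_to_mbit` and `link_share_percent` are hypothetical helpers, and the 100 Mbit/s figure is the nominal FastEthernet speed, not a measured throughput.

```shell
#!/bin/sh
# Convert a syncer rate in MByte/s to Mbit/s (1 byte = 8 bits).
# $1 = rate in MByte/s
mbyte_to_mbit() {
    echo $(( $1 * 8 ))
}

# Share of a link's nominal capacity used by the sync rate, as a whole percentage.
# $1 = rate in MByte/s, $2 = nominal link speed in Mbit/s
link_share_percent() {
    echo $(( $1 * 8 * 100 / $2 ))
}

mbyte_to_mbit 10            # prints 80
link_share_percent 10 100   # prints 80
```

If the replication link also carries other traffic, a lower rate leaves more headroom for it.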
Note: For more information, see the man page of the DRBD configuration file:
man drbd.conf
OCFS2
Some filesystems must be added to fstab. The first two are needed for OCFS2 to work. The last is our DRBD
device; it is required only for automatic mounting at boot time.
File: /etc/fstab
# Needed by ocfs2
none          /sys/kernel/config   configfs      defaults   0 0
none          /sys/kernel/dlm      ocfs2_dlmfs   defaults   0 0
/dev/drbd1    /xen                 ocfs2         noatime    0 0
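Whether the three entries are in place can be checked with a small sketch like the following. `has_fstab_type` is a hypothetical helper that looks only at the filesystem type in the third column, so it does not verify mount points or options.

```shell
#!/bin/sh
# Succeed if an fstab-style file has a non-comment entry of the given type.
# $1 = filesystem type (third column), $2 = file to inspect (default: /etc/fstab)
has_fstab_type() {
    awk -v t="$1" '$1 !~ /^#/ && $3 == t { found = 1 } END { exit !found }' "${2:-/etc/fstab}"
}

# Report on the three types this setup relies on.
for fstype in configfs ocfs2_dlmfs ocfs2; do
    if has_fstab_type "$fstype"; then
        echo "$fstype: present"
    else
        echo "$fstype: missing"
    fi
done
```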
File: /etc/ocfs2/cluster.conf
node:
	ip_port = 7777
	ip_address = 192.168.0.32
	number = 1
	name = Tweedledee
	cluster = ocfs2_xen
cluster:
	node_count = 2
	name = ocfs2_xen
Note: you need to create this file yourself as the current ebuild at the time of this writing does not include it.
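Since the file has to be written by hand, a sketch like the one below can generate it for both nodes. The Tweedledum stanza is an assumption: its address is taken from the DRBD resource definition and its node number is assumed to be 0. The function names are hypothetical. In the O2CB format, parameter lines are indented with a tab and each node name should match that machine's hostname.

```shell
#!/bin/sh
# Print one O2CB node stanza; parameter lines start with a tab.
# $1 = node name, $2 = IP address, $3 = node number, $4 = cluster name
node_stanza() {
    printf 'node:\n'
    printf '\tip_port = 7777\n'
    printf '\tip_address = %s\n' "$2"
    printf '\tnumber = %s\n' "$3"
    printf '\tname = %s\n' "$1"
    printf '\tcluster = %s\n\n' "$4"
}

# Print the cluster stanza.
# $1 = cluster name, $2 = node count
cluster_stanza() {
    printf 'cluster:\n\tnode_count = %s\n\tname = %s\n' "$2" "$1"
}

# Assumption: Tweedledum is node 0 at 192.168.0.31.
make_cluster_conf() {
    node_stanza Tweedledum 192.168.0.31 0 ocfs2_xen
    node_stanza Tweedledee 192.168.0.32 1 ocfs2_xen
    cluster_stanza ocfs2_xen 2
}

# Print for review; redirect to /etc/ocfs2/cluster.conf on both nodes when satisfied.
make_cluster_conf
```

The same file must be present on every node, so generating it once and copying it avoids mismatches.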
Now we must tell the init script which cluster must be started.
File: /etc/conf.d/ocfs2
OCFS2_CLUSTER="ocfs2_xen"
Note: Be sure to check your firewall for ports 7777 and 7789, or any ports you set. Otherwise things will not work as
expected.
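If the nodes use iptables, rules along these lines would open the two ports to the peer. This is a sketch under that assumption: `print_cluster_rules` is a hypothetical helper that only prints the commands so they can be reviewed and adapted to the existing rule set.

```shell
#!/bin/sh
# Print (not apply) iptables rules accepting TCP traffic from the peer.
# $1 = peer address; remaining arguments = TCP ports
print_cluster_rules() {
    peer=$1
    shift
    for port in "$@"; do
        echo "iptables -A INPUT -p tcp -s ${peer} --dport ${port} -j ACCEPT"
    done
}

# On Tweedledum the peer is Tweedledee (192.168.0.32): 7777 for OCFS2, 7789 for DRBD.
print_cluster_rules 192.168.0.32 7777 7789
```

On Tweedledee, pass 192.168.0.31 as the peer address instead.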
Enable the resource. This step associates the resource with its backing device (or devices, in case of a multivolume resource), sets replication parameters, and connects the resource to its peer:
drbdadm up drbd_xen
Note: When executing these commands you may receive messages like this one:
DRBD module version: 8.3.7
userland version: 8.3.8
preferably kernel and userland versions should match.
You may ignore these warnings, but verify that the versions match before production use.
After issuing this command, the initial full synchronization will commence. You will be able to monitor its
progress via /proc/drbd. It may take some time depending on the size of the device.
watch cat /proc/drbd
version: 8.3.7 (api:88/proto:86-92)
built-in
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---
    ns:20502072 nr:0 dw:0 dr:20510092 al:0 bm:1250 lo:146 pe:2782 ua:2048 ap:0 ep:1 wo:b oos:105334296
[==>.................] sync'ed: 16.3% (102864/122876)M
finish: 0:35:27 speed: 49,468 (60,444) K/sec
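For scripted monitoring, the fields in this output can be pulled out with sed. The sketch below assumes the 8.3-style /proc/drbd layout shown above; `sync_percent` and `drbd_field` are hypothetical helper names, and the sample text is a shortened copy of that output.

```shell
#!/bin/sh
# Extract the sync percentage from /proc/drbd-style text on stdin.
sync_percent() {
    sed -n "s/.*sync'ed:[[:space:]]*\([0-9.]*\)%.*/\1/p"
}

# Extract one field (cs, ro or ds) from the device status line on stdin.
# $1 = field name
drbd_field() {
    sed -n "s/.*[[:space:]]$1:\([^[:space:]]*\).*/\1/p" | head -n 1
}

sample=" 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---
    [==>.................] sync'ed: 16.3% (102864/122876)M"

printf '%s\n' "$sample" | sync_percent    # prints 16.3
printf '%s\n' "$sample" | drbd_field cs   # prints SyncSource
```

On a live node the same functions can read the real status, for example `sync_percent < /proc/drbd`.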
After this is finished and you see consistent state at both sides, let's create the ocfs2 filesystem on the drbd
device.
Label: ocfs2_xen
Features: sparse backup-super unwritten inline-data strict-journal-super
Block size: 4096 (12 bits)
Cluster size: 4096 (12 bits)
Volume size: 128845049856 (31456311 clusters) (31456311 blocks)
Cluster groups: 976 (tail covers 6711 clusters, rest cover 32256 clusters)
Extent allocator size: 67108864 (16 groups)
Journal size: 268435456
Node slots: 2
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 4 block(s)
Formatting Journals: done
Growing extent allocator: done
Formatting slot map: done
Writing lost+found: done
mkfs.ocfs2 successful
This will create an OCFS2 file system with two node slots on /dev/drbd1, and set the filesystem label to
ocfs2_xen.
Warning: This step must be done on one node only. The filesystem is created on the DRBD device that is active, and
DRBD replicates it to the other node.
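A command matching that description (two node slots, label ocfs2_xen, device /dev/drbd1) might look like the one printed by this sketch. It is an assumption about the invocation, built from the mkfs.ocfs2 options -N (node slots) and -L (label); `mkfs_ocfs2_cmd` is a hypothetical helper that prints the command instead of running it, since formatting destroys existing data.

```shell
#!/bin/sh
# Print the mkfs.ocfs2 invocation for review instead of executing it.
# $1 = number of node slots, $2 = filesystem label, $3 = device
mkfs_ocfs2_cmd() {
    echo "mkfs.ocfs2 -N $1 -L $2 $3"
}

mkfs_ocfs2_cmd 2 ocfs2_xen /dev/drbd1
```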
Restart DRBD
/etc/init.d/drbd restart
cat /proc/drbd
version: 8.3.7 (api:88/proto:86-92)
built-in
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r---
    ns:0 nr:0 dw:0 dr:408 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
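The check for this state can also be scripted. The sketch below assumes the status line format shown above; `is_dual_primary_ready` is a hypothetical helper that succeeds only when the resource is connected, primary on both sides, and up to date on both sides.

```shell
#!/bin/sh
# Succeed only if the status text on stdin shows a connected, dual-primary,
# fully synchronized resource.
is_dual_primary_ready() {
    grep -q 'cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate'
}

status=' 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r---'
if printf '%s\n' "$status" | is_dual_primary_ready; then
    echo "ready to mount"
else
    echo "not ready"
fi
```

On a live node, feed it the real status with `is_dual_primary_ready < /proc/drbd`.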
If it looks fine and active-active (Primary/Primary), you can mount your new OCFS2 filesystem:
mount /xen
The cluster can be tested by creating a file on one node and reading it on the other.
We can now safely add drbd and ocfs2 to the default runlevel.
rc-update add drbd default
rc-update add ocfs2 default
References
Some references the authors used and the reader may find useful:
http://www.drbd.org/users-guide/ch-ocfs2.html
http://www.drbd.org/users-guide-emb/ch-configure.html
http://linux.dell.com/wiki/index.php/Set_up_an_OCFS2_cluster_filesystem
Edited by: Xerxes - 2011-10-16
Resources
This init script comes from the package submitted by Mathijs in his original article. The only modification made
here is the removal of the module checks, which fail when DRBD is built into the kernel. The mount check
will presumably fail if the modules are not loaded.
Fix me: Somebody with better init script understanding can maybe improve this script.
File: /etc/init.d/ocfs2
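#!/sbin/runscript
# Copyright 1999-2006 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2

depend() {
    need net localmount
    before netmount
}

# Verify that the configfs and ocfs2_dlmfs pseudo-filesystems are mounted.
check_pseudofs() {
    local retval=0
    local HASMOUNT="mount -l -t"
    if [ -z "`${HASMOUNT} configfs`" ] ; then
        retval=1
    fi
    if [ -z "`${HASMOUNT} ocfs2_dlmfs`" ] ; then
        retval=1
    fi
    if [ ${retval} -eq 1 ] ; then
        # Assumed wording: hint at the fstab lines the checks above look for.
        ewarn "One or more pseudo-filesystems are not mounted. Check /etc/fstab for:"
        ewarn "none /sys/kernel/config configfs defaults 0 0"
        ewarn "none /dlm ocfs2_dlmfs defaults 0 0"
    fi
    return ${retval}
}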
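
start() {
    check_pseudofs || return $?

    # A small hack to avoid:
    # mount.ocfs2: Unable to access cluster service while trying to join the group
    OCFS2_CLUSTER=$( sed -re 's/[[:space:]]+//g' /etc/ocfs2/cluster.conf | grep -A4 "cluster:" | grep 'name=' | cut -d= -f2 )

    # Assumption: only the output redirection of this command survives in the
    # source; bringing the cluster online with o2cb_ctl is the likely intent.
    /sbin/o2cb_ctl -H -n "${OCFS2_CLUSTER}" -t cluster -a online=yes >/dev/null 2>&1
}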

stop() {
    # Shamelessly stolen from netmount
    local ret
    ebegin "Unmounting OCFS2 filesystems"
    [ -z "$(umount -art ocfs2 2>&1)" ]
    ret=$?
    eend ${ret} "Failed to simply unmount filesystems"
    [ ${ret} -eq 0 ] && return 0

    # Escalate: kill processes holding the mounts, then retry the unmount.
    declare -a siglist=( "TERM" "KILL" "KILL" )
    local retry=0
    local remaining="go"
    while [ -n "${remaining}" -a ${retry} -lt 3 ]
    do
        remaining="$(awk '$3 ~ /'ocfs2'/ { if ($2 != "/") print $2 }' /proc/mounts | sort -r)"
        IFS=$'\n'
        set -- ${remaining//\\040/ }
        unset IFS
        [ -z "${remaining}" ] && break
        ebegin $'\t'"Unmounting ocfs2 filesystems (retry #$((retry+1)))"
        /bin/fuser -k -${siglist[$((retry++))]} -m "$@" &>/dev/null
        sleep 5
        umount "$@" &>/dev/null
        eend $? $'\t'"Failed to unmount filesystems"
    done

    einfo "Stopping OCFS2 cluster"
    # OCFS2_CLUSTER is set in /etc/conf.d/ocfs2.
    for cluster in ${OCFS2_CLUSTER}; do
        # Assumption: the loop body was lost in the source; taking the cluster
        # offline with o2cb_ctl is the likely intent.
        /sbin/o2cb_ctl -H -n "${cluster}" -t cluster -a online=no >/dev/null 2>&1
    done
}
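
The cluster-name lookup used in start() can be tried in isolation before relying on it. The sketch below wraps the same pipeline in a hypothetical `cluster_name` function and runs it on a throwaway file that mirrors the example configuration. It assumes a sed that accepts -r (GNU sed does).

```shell
#!/bin/sh
# Extract the cluster name from an O2CB cluster.conf.
# $1 = path to cluster.conf
cluster_name() {
    # Strip all whitespace, take the "cluster:" stanza plus the four lines
    # after it, and keep the value of its name= line.
    sed -re 's/[[:space:]]+//g' "$1" | grep -A4 'cluster:' | grep 'name=' | cut -d= -f2
}

# Try it on a temporary copy of the example configuration.
conf=$(mktemp)
printf 'node:\n\tname = Tweedledee\n\tcluster = ocfs2_xen\n\ncluster:\n\tnode_count = 2\n\tname = ocfs2_xen\n' > "$conf"
cluster_name "$conf"    # prints ocfs2_xen
rm -f "$conf"
```

Because the pipeline keeps four lines after "cluster:", a node stanza placed directly after the cluster stanza could contribute a second name. Keeping the cluster stanza last in the file, as in the example, avoids that.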