What is GFS?
GFS allows all nodes to have direct CONCURRENT write access to the same shared BLOCK storage.
With a local file system such as ext3, a shared BLOCK device can be mounted on multiple nodes, but CONCURRENT write access is not allowed.
With NFS, CONCURRENT write access is allowed, but it is not a direct BLOCK device, which introduces latency and another layer of failure.
GFS requirements:
– A shared block storage device (iSCSI, FC SAN, etc.)
– RHCS (Red Hat Cluster Suite). GFS can be mounted on a standalone server without a cluster, but that is primarily used for testing or for recovering data when the cluster fails.
– RHEL 3.x onwards (and RHEL derivatives: CentOS/Fedora). It should also work on other Linux distributions, since GFS and RHCS have been open sourced.
GFS specifications:
– RHEL 5.3 onwards uses GFS2
– RHEL 5/6.1 supports a maximum of 16 nodes
– RHEL 5/6.1 64-bit supports a maximum file system size of 100 TB (8 EB in theory)
– Supports: data and metadata journaling, quota, ACL, direct I/O, growing the file system online, and dynamic inodes (converting inode blocks to data blocks)
– LVM snapshots of CLVM under GFS are NOT yet supported.
GFS components:
RHCS components: OpenAIS, CCS, fenced, CMAN and CLVMD (Clustered LVM)
GFS-specific component: Distributed Lock Manager (DLM)
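Once the cluster is up, a quick sanity check that these components are running might look like the commands below (output depends on your cluster configuration):
$cman_tool status #cluster name, quorum and node count (CMAN/OpenAIS)
$cman_tool nodes #membership and state of each node
$group_tool ls #fence, dlm and gfs groups known on this node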
Install RHCS and GFS rpms
Luci (from the Conga project) is the easiest way to install and configure RHCS and GFS.
#GFS specific packages:
#RHEL 5.2 or lower versions
$yum install gfs-utils kmod-gfs
#RHEL 5.3 onwards, the gfs2 module is part of the kernel
$yum install gfs2-utils
Create GFS on LVM
You can create GFS on a raw device, but LVM is recommended for consistent device names and the ability to extend the device.
#Assume you have setup and tested a working RHCS
#Edit cluster lock type in /etc/lvm/lvm.conf on ALL nodes
locking_type=3
#Create PV/VG/LV as in a standalone system, ONCE on any ONE of the nodes
#Start the cluster and clvmd on ALL nodes
#It is easier to use the luci GUI to start the whole cluster
$service cman start
$service rgmanager start
$service clvmd start
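#With clvmd running on all nodes, the PV/VG/LV creation (done ONCE on any ONE node) might look like this sketch; the device name /dev/sdb and the 100G size are assumptions, adjust them to your shared storage
$pvcreate /dev/sdb
$vgcreate vg01 /dev/sdb #with locking_type=3 and clvmd running, the VG is created as a clustered VG
$lvcreate -n lv01 -L 100G vg01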
#Create GFS ONCE in any ONE of the nodes
# -p lock_dlm is required in cluster mode. lock_nolock is for a standalone system
# -t cluster1:gfslv ( real cluster name : arbitrary GFS name )
# The above information is stored in the GFS superblock and can be changed with "gfs_tool sb" without re-initializing GFS, e.g. change the lock type: "gfs_tool sb /device proto lock_nolock"
# -j 2: the number of journals, minimum 1 per node. The default journal size is 128MiB and can be overridden with -J
# Additional journals can be added later with gfs_jadd (see the sketch after the mkfs command below)
gfs_mkfs -p lock_dlm -t cluster1:gfslv -j 2 /dev/vg01/lv01
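#If you later extend the LV, GFS can be grown online and journals added for new nodes; a rough sketch (the +50G size is an assumption, and both gfs commands are run against a mounted GFS from one node):
$lvextend -L +50G /dev/vg01/lv01
$gfs_grow /mnt/gfs #expand GFS to fill the extended LV
$gfs_jadd -j 1 /mnt/gfs #add one more journal for an additional node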
#Mount GFS in cluster member by /etc/fstab
#Put the GFS mount in /etc/fstab on ALL nodes
#NOTE:
#The cluster service can mount GFS without /etc/fstab after GFS is added as a resource, but then it is mounted on only one node (the active node). Since GFS is supposed to be mounted on all nodes at the same time, /etc/fstab is a must and a GFS resource is optional.
#The GFS mount options lockproto and locktable are optional; mount can obtain this information from the superblock automatically
$cat /etc/fstab
/dev/vg01/lv01 /mnt/gfs gfs defaults 0 0
#Mount all GFS mounts
$service gfs start
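#To mount manually instead of via the init script; the lock options in the second form are normally unnecessary, since mount reads them from the superblock:
$mount -t gfs /dev/vg01/lv01 /mnt/gfs
$mount -t gfs -o lockproto=lock_dlm,locktable=cluster1:gfslv /dev/vg01/lv01 /mnt/gfs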
GFS command lines
####Check GFS super block
#some values can be changed by “gfs_tool sb”
$gfs_tool sb /dev/vg01/lv01 all
sb_bsize = 4096
sb_lockproto = lock_dlm
sb_locktable = cluster1:gfslv01
..
####GFS tunable parameters
#view parameters
$gfs_tool gettune <mountpoint>
#set parameters
#The parameters don't persist after a re-mount. You can customize /etc/init.d/gfs to set tunable parameters at mount time, as shown in the sketch below
$gfs_tool settune <mountpoint> <parameter> <value>
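#For example (the values below are only illustrative), you could append lines like these to the start section of /etc/init.d/gfs, after the mounts succeed:
gfs_tool settune /mnt/gfs atime_quantum 86400
gfs_tool settune /mnt/gfs quota_enforce 0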
####Performance related parameters
#Like other file systems, you can disable access-time updates with the mount option "noatime"
#GFS also allows you to control how often the access time is updated
$gfs_tool gettune /mnt/gfs | grep atime_quantum
atime_quantum=3660 #in secs
#Disable quota if it is not needed
#GFS2 removes this tunable and implements it as the mount option "quota=off"
$gfs_tool settune /mnt/gfs quota_enforce 0
#GFS direct I/O
#Enable direct I/O for database files if the DB has its own buffering mechanism, to avoid "double" buffering
$gfs_tool setflag directio /mnt/gfs/test.1 #file attribute
$gfs_tool setflag inherit_directio /mnt/gfs/db/ #DIR attribute
$gfs_tool clearflag directio /mnt/gfs/test.1 #remove attribute
$gfs_tool stat inherit_directio /mnt/gfs/file # view attribute
#Enable data journaling for very small files
#Disable data journaling for large files
$gfs_tool setflag inherit_jdata /mnt/gfs/db/ #Enable data journaling (only metadata is journaled by default) on a dir (if operating on a file, the file must be zero-length)
###GFS backup; CLVM doesn't support snapshots
$gfs_tool freeze /mnt/gfs #make GFS read-only (done once on any one of the nodes)
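#While the file system is frozen, run the backup from any node; a minimal sketch (the target path is an assumption, any backup tool works):
$tar cf /backup/gfs-backup.tar -C /mnt/gfs .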
$gfs_tool unfreeze /mnt/gfs
###GFS repair
#After unmounting GFS on all nodes
$gfs_fsck -v /dev/vg01/lv01 # gfs_fsck -v -n /dev/vg01/lv01 : -n answers no to all questions, inspecting GFS without making changes
– NFS cluster: because NFS is I/O bound, why would you run an Active-Active NFS cluster when the CPU/memory resources of the nodes are not fully utilized anyway?