{"id":7058,"date":"2017-10-11T18:08:37","date_gmt":"2017-10-11T10:08:37","guid":{"rendered":"http:\/\/rmohan.com\/?p=7058"},"modified":"2017-10-11T18:08:37","modified_gmt":"2017-10-11T10:08:37","slug":"cluster-admin-interview-question","status":"publish","type":"post","link":"https:\/\/mohan.sg\/?p=7058","title":{"rendered":"Cluster Admin: Interview Question"},"content":{"rendered":"<p>Cluster Admin: Interview Question<\/p>\n<p>Cluster Administration<br \/>\n1      What is a Cluster<br \/>\nA cluster is two or more computers (called as nodes or members) that works together to perform a taks.<br \/>\n2      What are the types of cluster<br \/>\nStorage<br \/>\nHigh Availability<br \/>\nLoad Balancing<br \/>\nHigh Performance<br \/>\n3      What is CMAN<br \/>\nCMAN is Cluster Manager. It manages cluster quorum and cluster membership.<br \/>\nCMAN runs on each node of a cluster<br \/>\n4      What is Cluster Quorum<br \/>\nQuorum is a voting algorithm used by CMAN.<br \/>\nCMAN keeps a track of cluster quorum by monitoring the count of number of nodes in cluster.<br \/>\nIf more than half of members of a cluster are in active state, the cluster is said to be in Quorum<br \/>\nIf half or less than half of the members are not active, the cluster is said to be down and all cluster activities will be stopped<br \/>\nQuorum is defined as the minimum set of hosts required in order to provide service and is used to prevent split-brain situations.<br \/>\nThe quorum algorithm used by the RHCS cluster is called \u201csimple majority quorum\u201d, which means that more than half of the hosts must be online and communicating in order to provide service.<br \/>\n5      What is split-brain<br \/>\nIt is a condition where two instances of the same cluster are running and trying to access same resource at the same time, resulting in corrupted cluster integrity<br \/>\nCluster must maintain quorum to prevent split-brain issues<br \/>\n6      What is Quorum disk<br \/>\nIn case of a 
two-node cluster, the quorum disk acts as a tie-breaker and prevents split-brain issues.<br \/>\nIf a node has access to the network and the quorum disk, it is active.<br \/>\nIf a node has lost access to the network or the quorum disk, it is inactive and can be fenced.<br \/>\nA quorum disk, known as a qdisk, is a small partition on SAN storage used to enhance quorum. It generally carries enough votes to allow even a single node to take quorum during a cluster partition. It does this by using configured heuristics, that is, custom tests, to decide which node or partition is best suited for providing clustered services during a cluster reconfiguration.<br \/>\n7      What is RGManager<br \/>\nRGManager manages and provides failover capabilities for collections of cluster resources called services, resource groups, or resource trees.<br \/>\nIn the event of a node failure, RGManager will relocate the clustered service to another node with minimal service disruption. You can also restrict services to certain nodes, such as restricting httpd to one group of nodes while mysql is restricted to a separate set of nodes.<br \/>\nWhen the cluster membership changes, openais tells the cluster that it needs to recheck its resources. This causes rgmanager, the resource group manager, to run. It will examine what changed and then start, stop, migrate or recover cluster resources as needed.<br \/>\nWithin rgmanager, one or more resources are brought together as a service. This service is then optionally assigned to a failover domain, a subset of nodes that can have preferential ordering.<br \/>\n8      What is Fencing<br \/>\nFencing is the disconnection of a node from the cluster\u2019s shared storage. Fencing cuts off I\/O from shared storage, thus ensuring data integrity. 
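As an illustration of how this is wired up, a power-fencing device might be declared in cluster.conf roughly like the following sketch (the node name, IP address, credentials, and the choice of the fence_ipmilan agent are all assumptions for illustration, not taken from a real cluster):

```xml
<!-- Illustrative cluster.conf fragment only: one IPMI power-fencing
     device and the per-node fence method that references it.
     All names, addresses, and credentials are made up. -->
<fencedevices>
  <fencedevice name="node1_ipmi" agent="fence_ipmilan"
               ipaddr="192.168.10.11" login="admin" passwd="secret"/>
</fencedevices>
<clusternode name="node1.example.com" nodeid="1">
  <fence>
    <method name="power">
      <device name="node1_ipmi"/>
    </method>
  </fence>
</clusternode>
```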
The cluster infrastructure performs fencing through the fence daemon, fenced.<br \/>\nPower fencing \u2014 A fencing method that uses a power controller to power off an inoperable node.<br \/>\nStorage fencing \u2014 A fencing method that disables the Fibre Channel port that connects storage to an inoperable node.<br \/>\nOther fencing \u2014 Several other fencing methods that disable I\/O to or power off an inoperable node, including IBM BladeCenters, PAP, DRAC\/MC, HP iLO, IPMI, IBM RSA II, and others.<br \/>\n9      How to manually fence an inactive node<br \/>\n# fence_ack_manual -n &lt;node_name&gt;<br \/>\n10   How to see a shared IP address (cluster resource) if ifconfig doesn\u2019t show it<br \/>\n# ip addr list<br \/>\n11   What is DLM<br \/>\nA lock manager is a traffic cop that controls access to resources in the cluster.<br \/>\nAs implied in its name, DLM is a distributed lock manager and runs in each cluster node; lock management is distributed across all nodes in the cluster. GFS2 and CLVM use locks from the lock manager.<br \/>\n12   What is Conga<br \/>\nThis is a comprehensive user interface for installing, configuring, and managing the Red Hat High Availability Add-On.<br \/>\nLuci \u2014 This is the application server that provides the user interface for Conga. It allows users to manage cluster services, and it can be run from outside the cluster environment.<br \/>\nRicci \u2014 This is a service daemon that manages distribution of the cluster configuration and runs on every cluster node. Users pass configuration details using the Luci interface, and the configuration is loaded into corosync for distribution to the cluster nodes.<br \/>\n13   What is OpenAIS or Corosync<br \/>\nOpenAIS is the heart of the cluster. All other cluster components operate through it, and no cluster component can work without it. Further, it is shared between both Pacemaker and RHCS clusters.<br \/>\nIn Red Hat clusters, openais is configured via the central cluster.conf file. 
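As a sketch, a minimal two-node cluster.conf might look like this (the cluster name, node names, and version number are illustrative assumptions):

```xml
<?xml version="1.0"?>
<!-- Illustrative skeleton only; all names and values are assumptions. -->
<cluster name="examplecluster" config_version="1">
  <!-- two_node="1" with expected_votes="1" is the special two-node quorum case -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1" votes="1"/>
    <clusternode name="node2.example.com" nodeid="2" votes="1"/>
  </clusternodes>
  <fencedevices/>
  <rm/>
</cluster>
```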
In Pacemaker clusters, it is configured directly in openais.conf.<br \/>\n14   What is Totem<br \/>\nThe totem protocol defines message passing within the cluster and is used by openais. A token is passed around all the nodes in the cluster, and the timeout in fencing is actually a token timeout. The counter, then, is the number of lost tokens that are allowed before a node is considered dead.<br \/>\nThe totem protocol supports something called \u2018rrp\u2019, the Redundant Ring Protocol. Through rrp, you can add a second backup ring on a separate network to take over in the event of a failure in the first ring. In RHCS, these rings are known as \u201cring 0\u201d and \u201cring 1\u201d.<br \/>\n15   What is CLVM<br \/>\nCLVM is ideal in that, by using DLM, the distributed lock manager, it won\u2019t allow access to cluster members outside of openais\u2019s closed process group, which, in turn, requires quorum.<br \/>\nIt is ideal because it can take one or more raw devices, known as \u201cphysical volumes\u201d, or simply PVs, and combine their raw space into one or more \u201cvolume groups\u201d, known as VGs. These volume groups then act just like a typical hard drive and can be \u201cpartitioned\u201d into one or more \u201clogical volumes\u201d, known as LVs. These LVs are where Xen\u2019s domU virtual machines will exist and where we will create our GFS2 clustered file system.<br \/>\n16   What is GFS2<br \/>\nIt works much like a standard filesystem, with user-land tools like mkfs.gfs2, fsck.gfs2 and so on. The major difference is that it and clvmd use the cluster\u2019s distributed locking mechanism provided by the dlm_controld daemon. Once formatted, the GFS2-formatted partition can be mounted and used by any node in the cluster\u2019s closed process group. 
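As a sketch of the usual workflow (the cluster name, logical volume path, mount point, and journal count below are assumptions; the commands need root privileges and a running cluster, so treat this as a transcript sketch rather than something to copy verbatim):

```
# Create a GFS2 filesystem using DLM locking; the lock table passed to -t
# has the form clustername:fsname, and -j sets one journal per mounting node.
mkfs.gfs2 -p lock_dlm -t examplecluster:shared_fs -j 2 /dev/vg_cluster/lv_shared

# Mount it on each node; every node can then use it concurrently.
mount -t gfs2 /dev/vg_cluster/lv_shared /mnt/shared
```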
All nodes can then safely read from and write to the data on the partition simultaneously.<br \/>\n17   What is the importance of DLM<br \/>\nOne of the major roles of a cluster is to provide distributed locking on clustered storage. In fact, storage software cannot be clustered without using DLM, as provided by the dlm_controld daemon and using openais\u2019s virtual synchrony via CPG.<br \/>\nThrough DLM, all nodes accessing clustered storage are guaranteed to get POSIX locks, called plocks, in the same order across all nodes. Both CLVM and GFS2 rely on DLM, and other clustered storage, like OCFS2, uses it as well.<br \/>\n18   What is ccs_tool<br \/>\nWe can use ccs_tool, the \u201ccluster configuration system (tool)\u201d, to push a new cluster.conf to the other nodes and upgrade the cluster\u2019s configuration version in one shot.<br \/>\n# ccs_tool update \/etc\/cluster\/cluster.conf<br \/>\n19   What is cman_tool<br \/>\nIt is the Cluster Manager tool; it can be used to view the nodes and status of the cluster.<br \/>\n# cman_tool nodes<br \/>\n# cman_tool status<br \/>\n20   What is clustat<br \/>\nclustat is used to see what state the cluster\u2019s resources are in.<br \/>\n21   What is clusvcadm<br \/>\nclusvcadm is a tool to manage resources in a cluster.<br \/>\nclusvcadm -e &lt;service&gt; -m &lt;node&gt; : Enable the &lt;service&gt; on the specified &lt;node&gt;. When a &lt;node&gt; is not specified, the local node where the command was run is assumed.<br \/>\nclusvcadm -d &lt;service&gt; : Disable the &lt;service&gt;.<br \/>\nclusvcadm -l &lt;service&gt; : Locks the &lt;service&gt; prior to a cluster shutdown. The only action allowed when a &lt;service&gt; is frozen is disabling it. This allows you to stop the &lt;service&gt; so that rgmanager doesn\u2019t try to recover it (restart, in our two services). 
Once quorum is dissolved and the cluster is shut down, the service is unlocked and returns to normal operation the next time the node regains quorum.<br \/>\nclusvcadm -u &lt;service&gt; : Unlocks a &lt;service&gt;, should you change your mind and decide not to stop the cluster.<br \/>\n22   What is luci_admin init<br \/>\nThis command is run to create the Luci admin user and set a password for it.<br \/>\n# service luci start; chkconfig luci on<br \/>\nThe default port for the Luci web server is 8084<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Cluster Admin: Interview Question<\/p>\n<p>Cluster Administration 1 What is a Cluster A cluster is two or more computers (called nodes or members) that work together to perform a task. 2 What are the types of cluster Storage High Availability Load Balancing High Performance 3 What is CMAN CMAN is the Cluster Manager. It manages cluster [&#8230;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[36],"tags":[],"_links":{"self":[{"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/posts\/7058"}],"collection":[{"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mohan.sg\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7058"}],"version-history":[{"count":1,"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/posts\/7058\/revisions"}],"predecessor-version":[{"id":7059,"href":"https:\/\/mohan.sg\/index.php?rest_route=\/wp\/v2\/posts\/7058\/revisions\/7059"}],"wp:attachment":[{"href":"https:\/\/mohan.sg\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7058"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mohan.sg\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&
post=7058"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mohan.sg\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7058"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}