Red Hat Cluster Suite Overview

Red Hat Cluster Suite for Red Hat Enterprise Linux 4.5

ISBN: N/A

Summary of Contents for CLUSTER SUITE FOR ENTERPRISE LINUX 4.5

Page 2: ...Red Hat Cluster Suite Overview provides an overview of Red Hat Cluster Suite for Red Hat Enterprise Linux 4.5. ...

Page 3: ...ions of this document is prohibited without the explicit permission of the copyright holder. Distribution of the work or derivative of the work in any standard (paper) book form for commercial purposes is prohibited unless prior permission is obtained from the copyright holder. Red Hat and the Red Hat "Shadow Man" logo are registered trademarks of Red Hat, Inc. in the United States and other countries. All...

Page 5: ... 3 Economy and Performance; 6. Cluster Logical Volume Manager; 7. Global Network Block Device; 8. Linux Virtual Server; 8.1. Two-Tier LVS Topology; 8.2. Three-Tier LVS Topology; 8.3. Routing Methods; 8.4. Persistence and Firewall Marks; 9. Cluster Administration Tools; 9.1. Conga; 9.2. Cluster Administration GUI; 9.3. Command Line Administration Tools; 10. Linux Virtual Server Administ...

Page 7: ...or the data center, workplace, and home. This document contains overview information about Red Hat Cluster Suite for Red Hat Enterprise Linux 4.5 and is part of a documentation set that provides conceptual, procedural, and reference information about Red Hat Cluster Suite for Red Hat Enterprise Linux 4.5. Red Hat Cluster Suite documentation and other Red Hat documents are available in HTML, PDF, and RPM v...

Page 8: ...s. When shown as below, it indicates computer output: Desktop about.html logs paulwesterberg.png Mail backupfiles mail reports. bold Courier font: Bold Courier font represents text that you are to type, such as: service jonas start. If you have to run a command as root, the root prompt (#) precedes the command: # gconftool-2. italic Courier font: Italic Courier font represents a variable, such as an installation dir...

Page 9: ... indicates an act that would violate your support agreement, such as recompiling the kernel. Warning: A warning indicates potential data loss, as may happen when tuning hardware for maximum performance. 2. Feedback: If you spot a typo, or if you have thought of a way to make this document better, we would love to hear from you. Please submit a report in Bugzilla (http://bugzilla.redhat.com/bugzilla/) against the...

Page 10: ...hich version of the guide you have. If you have a suggestion for improving the documentation, try to be as specific as possible. If you have found an error, please include the section number and some of the surrounding text so we can find it easily. ...

Page 11: ...tion 8, "Linux Virtual Server"; Section 9, "Cluster Administration Tools"; Section 10, "Linux Virtual Server Administration GUI". 1. Cluster Basics: A cluster is two or more computers (called nodes or members) that work together to perform a task. There are four major types of clusters: storage, high availability, load balancing, and high performance. Storage clusters provide a consistent file system image across servers i...

Page 12: ...nts outside the cluster. Red Hat Cluster Suite provides load balancing through LVS (Linux Virtual Server). High-performance clusters use cluster nodes to perform concurrent calculations. A high-performance cluster allows applications to work in parallel, therefore enhancing the performance of the applications. (High-performance clusters are also referred to as computational clusters or grid computing.) Note...

Page 13: ...iple nodes to share storage at a block level as if the storage were connected locally to each cluster node. Cluster Logical Volume Manager (CLVM): Provides volume management of cluster storage. Note: When you create or modify a CLVM volume for a clustered environment, you must ensure that you are running the clvmd daemon. For further information, refer to Section 6, "Cluster Logical Volume Manager". Global Net...

Page 14: ...ure provides the basic functions for a group of computers (called nodes or members) to work together as a cluster. Once a cluster is formed using the cluster infrastructure, you can use other Red Hat Cluster Suite components to suit your clustering needs (for example, setting up a cluster for sharing files on a GFS file system or setting up service failover). The cluster infrastructure performs the follow...

Page 15: ...orum by monitoring the count of cluster nodes that run cluster manager. (In a CMAN cluster, all cluster nodes run cluster manager; in a GULM cluster, only the GULM servers run cluster manager.) If more than half the nodes that run cluster manager are active, the cluster has quorum. If half the nodes that run cluster manager (or fewer) are active, the cluster does not have quorum, and all cluster activity is st...
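
The quorum rule in the Page 15 excerpt can be sketched as a small check. This is only an illustration under stated assumptions: the function name `has_quorum` is hypothetical, and every node that runs cluster manager is counted equally, which ignores any per-node vote weighting the real cluster manager may apply.

```python
def has_quorum(active_nodes: int, total_nodes: int) -> bool:
    """Return True only when more than half of the counted nodes are active.

    total_nodes is the number of nodes that run cluster manager;
    active_nodes is how many of those are currently active.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if not 0 <= active_nodes <= total_nodes:
        raise ValueError("active_nodes must be between 0 and total_nodes")
    # "More than half" expressed with integers: 2 * active > total.
    # Exactly half (for example 2 of 4) therefore does not give quorum.
    return 2 * active_nodes > total_nodes
```

With this rule, 3 active nodes out of 5 have quorum, while 2 of 4 or 1 of 2 do not, which matches the "half or fewer" wording of the excerpt.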

Page 16: ...d mounts a GFS file system that nodes B and C have already mounted, then an additional journal and lock management is required for node A to use that GFS file system. If a cluster node does not transmit a heartbeat message within a prescribed amount of time, the cluster manager removes the node from the cluster and communicates to other cluster infrastructure components that the node is not a member ...

Page 17: ...LM runs in nodes designated as GULM server nodes; lock management is centralized in the nodes designated as GULM server nodes. GULM server nodes manage locks through GULM clients in the cluster nodes (refer to Figure 1.3, "GULM Overview"). With GULM, lock management operates in a limited number of nodes: either one, three, or five nodes configured as GULM servers. GFS and CLVM use locks from the lock manager. G...

Page 18: ...encing device. The fencing program makes a call to a fencing agent specified in the cluster configuration file. The fencing agent, in turn, fences the node via a fencing device. When fencing is complete, the fencing program notifies the cluster manager. Red Hat Cluster Suite provides a variety of fencing methods: Power fencing: A fencing method that uses a power controller to power off an inoperable node. F...

Page 19: ...Figure 1.4. Power Fencing Example ...

Page 20: ...ltiple paths to storage. If a node has dual power supplies, then the fencing method for the node must specify at least two fencing devices, one fencing device for each power supply (refer to Figure 1.6, "Fencing a Node with Dual Power Supplies"). Similarly, if a node has multiple paths to Fibre Channel storage, then the fencing method for the node must specify one fencing device for each path to Fibre Channe...

Page 21: ...Channel Connections. Figure 1.6. Fencing a Node with Dual Power Supplies ...

Page 22: ... fencing methods specified in the cluster configuration file. If a node fails, it is fenced using the first fencing method specified in the cluster configuration file for that node. If the first fencing method is not successful, the next fencing method specified for that node is used. If none of the fencing methods is successful, then fencing starts again with the first fencing method specified, and cont...
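
The fallback order described in the Page 22 excerpt can be sketched as a loop over the configured methods. This is a minimal sketch, not the fencing daemon's implementation: `fence_with_fallback` and the callables it receives are hypothetical stand-ins for fence agents, and the `max_rounds` cap is an assumption added only so the example always terminates (the excerpt is truncated before it says how long retries continue).

```python
from typing import Callable, Optional, Sequence

# A fencing method is modelled as a callable that returns True on success.
FenceMethod = Callable[[], bool]


def fence_with_fallback(methods: Sequence[FenceMethod],
                        max_rounds: int = 3) -> Optional[int]:
    """Try each method in configured order; restart from the first if all fail.

    Returns the index of the method that succeeded, or None if every method
    failed in every round (max_rounds is an assumption of this sketch).
    """
    if not methods:
        raise ValueError("at least one fencing method is required")
    for _ in range(max_rounds):
        for index, method in enumerate(methods):
            if method():
                return index
        # None of the methods worked: start again with the first one.
    return None
```

A caller would pass the methods in the same order as they appear for the node in the cluster configuration file, so the first entry is always tried first.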

Page 23: ... file in each cluster node is up to date. For example, if a cluster system administrator updates the configuration file in Node A, CCS propagates the update from Node A to the other nodes in the cluster (refer to Figure 1.8, "CCS Overview"). Figure 1.8. CCS Overview. Other cluster components (for example, CMAN) access configuration information from the configuration file through CCS (refer to Figure 1.8, "CCS Over...

Page 24: ...votes, and fencing method for that node. Fence Device: Displays fence devices in the cluster. Parameters vary according to the type of fence device. For example, for a power controller used as a fence device, the cluster configuration defines the name of the power controller, its IP address, login, and password. Managed Resources: Displays resources required to create cluster services. Managed resources includ...

Page 25: ... at a time to maintain data integrity. You can specify failover priority in a failover domain. Specifying failover priority consists of assigning a priority level to each node in a failover domain. The priority level determines the failover order, that is, which node a cluster service should fail over to. If you do not specify failover priority, a cluster service can fail over to any node in its...
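
Choosing a failover target from prioritized nodes, as the Page 25 excerpt describes, can be sketched as follows. The function name `pick_failover_node`, the node names, and the convention that a lower number means a higher priority are assumptions made for this illustration; the excerpt itself does not state which direction the priority levels run.

```python
from typing import Dict, Optional, Set


def pick_failover_node(priorities: Dict[str, int],
                       online: Set[str],
                       current: str) -> Optional[str]:
    """Return the online domain member with the best priority, or None.

    priorities maps each node in the failover domain to its priority level.
    Assumption of this sketch: a lower number means the node is preferred.
    """
    candidates = [node for node in priorities
                  if node in online and node != current]
    if not candidates:
        return None
    # Sort by priority first, then by name so ties are broken predictably.
    return min(candidates, key=lambda node: (priorities[node], node))
```

For example, with hypothetical priorities `{"B": 1, "D": 2, "A": 3}` and the service currently on B, the function returns D while D is online and falls back to A otherwise.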

Page 26: ...on, the failover domain is configured with a failover priority to fail over to node D before node A and to restrict failover to nodes only in that failover domain. The cluster service comprises these cluster resources: an IP address resource (IP address 10.10.10.201); an application resource named "httpd-content" (a web server application init script, /etc/init.d/httpd, specifying httpd); a file system resource (Re...

Page 27: ...over to node A. Failover would occur with no apparent interruption to the cluster clients. The cluster service would be accessible from another cluster node via the same IP address as it was before failover. 5. Red Hat GFS: Red Hat GFS is a cluster file system that allows a cluster of nodes to simultaneously access a block device that is shared among the nodes. GFS is a native file system that interface...

Page 28: ...gical Volume Manager. Red Hat GFS provides data sharing among GFS nodes in a Red Hat cluster. GFS provides a single, consistent view of the file system name space across the GFS nodes in a Red Hat cluster. GFS allows applications to install and run without much knowledge of the underlying storage infrastructure. Also, GFS provides features that are typically required in enterprise environments, such as q...

Page 29: ...S SAN configuration in Figure 1.12, "GFS with a SAN", provides superior file performance for shared files and file systems. Linux applications run directly on cluster nodes using GFS. Without file protocols or storage servers to slow data access, performance is similar to individual Linux servers with directly connected storage; yet, each GFS application node has [Figure 1.12. GFS with a SAN] 5.2. Performance ...

Page 30: ... by network client applications. File locking and sharing functions are handled by GFS for each network client. Figure 1.13. GFS and GNBD with a SAN. 5.3. Economy and Performance: Figure 1.14, "GFS and GNBD with Directly Connected Storage", shows how Linux client applications can take advantage of an existing Ethernet topology to gain shared access to all block storage devices. Client data files and file sys...

Page 31: ...standard LVM2 tool set and allows LVM2 commands to manage shared storage. clvmd runs in each cluster node and distributes LVM metadata updates in a cluster, thereby presenting each cluster node with the same view of the logical volumes (refer to Figure 1.15, "CLVM Overview"). Logical volumes created with CLVM on shared storage are visible to all nodes that have access to the shared storage. CLVM allows a u...

Page 32: ...stance LVM on the shared disk, as this may result in data corruption. If you have any concerns, please contact your Red Hat service representative. Note: Using CLVM requires minor changes to /etc/lvm/lvm.conf for cluster-wide locking. Figure 1.15. CLVM Overview. You can configure CLVM using the same commands as LVM2, using the LVM graphical user interface (refer to Figure 1.16, "LVM Graphical User Interface"), or...

Page 33: ...face. Figure 1.18, "Creating Logical Volumes", shows the basic concept of creating logical volumes from Linux partitions and shows the commands used to create logical volumes. Figure 1.16. LVM Graphical User Interface ...

Page 34: ...Figure 1.17. Conga LVM Graphical User Interface ...

Page 35: ...SCSI are not necessary or are cost-prohibitive. GNBD consists of two major components: a GNBD client and a GNBD server. A GNBD client runs in a node with GFS and imports a block device exported by a GNBD server. A GNBD server runs in another node and exports block-level storage from its local storage (either directly attached storage or SAN storage). Refer to Figure 1.19, "GNBD Overview". Multiple GNBD clien...

Page 36: ...er and one that is a backup LVS router. The active LVS router serves two roles: to balance the load across the real servers, and to check the integrity of the services on each real server. The backup LVS router monitors the active LVS router and takes over from it in case the active LVS router fails. Figure 1.20, "Components of a Running LVS Cluster", provides an overview of the LVS components and their interr...

Page 37: ...real server. Each nanny process checks the state of one configured service on one real server, and tells the lvs daemon if the service on that real server is malfunctioning. If a malfunction is detected, the lvs daemon instructs ipvsadm to remove that real server from the IPVS routing table. If the backup LVS router does not receive a response from the active LVS router, it initiates failover by calling...

Page 38: ... for data synchronization does not function optimally. Therefore, for real servers with a high amount of uploads, database transactions, or similar traffic, a three-tiered topology is more appropriate for data synchronization. 8.1. Two-Tier LVS Topology: Figure 1.21, "Two-Tier LVS Topology", shows a simple LVS configuration consisting of two tiers: LVS routers and real servers. The LVS-router tier consists of o...

Page 39: ...ng a presence at that IP address, also known as floating IP addresses. VIP addresses may be aliased to the same device that connects the LVS router to the public network. For instance, if eth0 is connected to the Internet, then multiple virtual servers can be aliased to eth0:1. Alternatively, each virtual server can be associated with a separate device per service. For example, HTTP traffic can be handled ...

Page 40: ...e connections relative to their destination IPs. This algorithm is for use in a proxy-cache server cluster. It routes the packets for an IP address to the server for that address unless that server is above its capacity and has a server in its half load, in which case it assigns the IP address to the least loaded real server. Locality-Based Least-Connection Scheduling with Replication Scheduling: Distr...
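
The locality-based rule quoted in the Page 40 excerpt can be sketched in a few lines. This is an illustrative model, not the kernel scheduler: the `RealServer` class, the `capacity` field (standing in for whatever limit the scheduler uses), and the choice to give a not-yet-seen destination IP to the least loaded server are all assumptions of this sketch. "A server in its half load" is read here as "some server has fewer than half its capacity in active connections".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RealServer:
    name: str
    capacity: int      # hypothetical per-server connection capacity
    active: int = 0    # current number of active connections


def least_loaded(servers: List[RealServer]) -> RealServer:
    """Pick the server with the fewest active connections (name breaks ties)."""
    return min(servers, key=lambda s: (s.active, s.name))


def lblc_pick(dest_ip: str,
              servers: List[RealServer],
              assigned: Dict[str, RealServer]) -> RealServer:
    """Route a connection for dest_ip, preferring the server already assigned."""
    current = assigned.get(dest_ip)
    if current is None:
        # Assumption: a destination seen for the first time goes to the
        # least loaded server.
        chosen = least_loaded(servers)
    else:
        overloaded = current.active > current.capacity
        half_loaded_exists = any(2 * s.active < s.capacity for s in servers)
        # Reassign only when the usual server is over capacity AND some
        # server is below half of its capacity; otherwise keep locality.
        chosen = least_loaded(servers) if overloaded and half_loaded_exists else current
    assigned[dest_ip] = chosen
    chosen.active += 1
    return chosen
```

Keeping the destination-to-server map sticky is what makes the rule suit proxy-cache clusters: requests for the same destination keep landing on the server that is likely to have it cached.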

Page 41: ...or IP packets addressed to the failed node. When the failed node returns to active service, the backup LVS router assumes its backup role again. The simple, two-tier configuration in Figure 1.21, "Two-Tier LVS Topology", is suited best for clusters serving data that does not change very frequently, such as static web pages, because the individual real servers do not automatically synchronize data among them...

Page 42: ...a central, highly available server and accessed by each real server via an exported NFS directory or Samba share. This topology is also recommended for websites that access a central, high-availability database for transactions. Additionally, using an active-active configuration with ...

Page 43: ...Routing illustrates LVS using NAT routing to move requests between the Internet and a private network. Figure 1.23. LVS Implemented with NAT Routing. In the example, there are two NICs in the active LVS router. The NIC for the Internet has a real IP address on eth0 and has a floating IP address aliased to eth0:1. The NIC for the private network interface has a real IP address on eth1 and has a floating ...

Page 44: ...S router uses network address translation to replace the address of the real server in the packets with the LVS router's public VIP address. This process is called IP masquerading because the actual IP addresses of the real servers are hidden from the requesting clients. Using NAT routing, the real servers can be any kind of computers running a variety of operating systems. The main disadvantage of NAT rou...

Page 45: ...s responses directly to clients, bypassing the LVS routers. Direct routing allows for scalability in that real servers can be added without the added burden on the LVS router to route outgoing packets from the real server to the client, which can become a bottleneck under heavy network load. While there are many advantages to using direct routing in LVS, there are limitations. The most common issue with...

Page 46: ...te the VIP to the LVS router, which will properly process the requests and send them to the real server pool. This can be done by using the arptables packet-filtering tool. 8.4. Persistence and Firewall Marks: In certain situations, it may be desirable for a client to reconnect repeatedly to the same real server, rather than have an LVS load-balancing algorithm send that request to the best available ser...

Page 47: ...on Tools: Red Hat Cluster Suite provides a variety of tools to configure and manage your Red Hat Cluster. This section provides an overview of the administration tools available with Red Hat Cluster Suite: Section 9.1, "Conga"; Section 9.2, "Cluster Administration GUI"; Section 9.3, "Command Line Administration Tools". 9.1. Conga: Conga is an integrated set of software components that provides centralized configur...

Page 48: ...ility provides a means of replicating a luci server instance and provides an efficient upgrade and testing path. When you install an instance of luci, its database is empty. However, you can import part or all of a luci database from an existing luci server when deploying a new luci server. Each luci instance has one user at initial installation: admin. Only the admin user may add systems to a luci serve...

Page 49: ...Figure 1.25. luci homebase Tab. Figure 1.26. luci cluster Tab ...

Page 50: ...structure and Section 4, "High-availability Service Management". The GUI consists of two major functions: the Cluster Configuration Tool and the Cluster Status Tool. The Cluster Configuration Tool provides the capability to create, edit, and propagate the cluster configuration file (/etc/cluster/cluster.conf). The Cluster Status Tool provides the capability to manage high-availability services. The following s...

Page 51: ...epresents cluster configuration components in the configuration file (/etc/cluster/cluster.conf) with a hierarchical graphical display in the left panel. A triangle icon to the left of a component name indicates that the component has one or more subordinate components assigned to it. Clicking the triangle icon expands and collapses the portion of the tree below a component. The components displayed in ...

Page 52: ...main properties when a failover domain is selected. Resources: For configuring shared resources to be used by high-availability services. Shared resources consist of file systems, IP addresses, NFS mounts and exports, and user-created scripts that are available to any high-availability service in the cluster. Resources are represented as subordinate elements under Resources. Using configuration buttons at...

Page 53: ...re determined by the cluster configuration file (/etc/cluster/cluster.conf). You can use the Cluster Status Tool to enable, disable, restart, or relocate a high-availability service. 9.3. Command Line Administration Tools: In addition to Conga and the system-config-cluster Cluster Administration GUI, command line tools are available for administering the cluster infrastructure and the high-availability Comma...

Page 54: ...re: gulm_tool is a program used to manage GULM. It provides an interface to lock_gulmd, the GULM lock manager. gulm_tool is available with GULM clusters only. For more information about this tool, refer to the gulm_tool(8) man page. fence_tool (Fence Tool; Cluster Infrastructure): fence_tool is a program used to join or leave the default fence domain. Specifically, it starts the fence daemon (fenced) to join the ...

Page 55: ...ly with either the hostname or the real IP address followed by :3636. If you are accessing the Piranha Configuration Tool remotely, you need an ssh connection to the active LVS router as the root user. Starting the Piranha Configuration Tool causes the Piranha Configuration Tool welcome page to be displayed (refer to Figure 1.30, "The Welcome Panel"). Logging in to the welcome page provides access to the fo...

Page 56: ... default value is 10 seconds. It is not recommended that you set the automatic update to an interval less than 10 seconds. Doing so may make it difficult to reconfigure the Auto update interval because the page will update too frequently. If you encounter this issue, simply click on another panel and then back on CONTROL/MONITORING. Update information now: Provides manual update of the status informatio...

Page 57: ... 32 The GLOBAL SETTINGS Panel. The top half of this panel sets up the primary LVS router's public and private network interfaces. Primary server public IP: The publicly routable real IP address for the primary LVS node. Primary server private IP: The real IP address for an alternative network interface on the primary LVS node. This address is used solely as an alternative heartbeat channel for the backu...

Page 58: ...ld be used as the gateway for the real servers. NAT Router netmask: If the NAT router's floating IP needs a particular netmask, select it from the drop-down list. NAT Router device: Defines the device name of the network interface for the floating IP address, such as eth1:1. 10.3. REDUNDANCY: The REDUNDANCY panel allows you to configure the backup LVS router node and set various heartbeat monitoring options...

Page 59: ...S node. Assume dead after (seconds): If the primary LVS node does not respond after this number of seconds, then the backup LVS router node will initiate failover. Heartbeat runs on port: Sets the port at which the heartbeat communicates with the primary LVS node. The default is set to 539 if this field is left blank. 10.4. VIRTUAL SERVERS: The VIRTUAL SERVERS panel displays information for each currently de...

Page 60: ...s radio button and click the (DE)ACTIVATE button. After adding a virtual server, you can configure it by clicking the radio button to its left and clicking the EDIT button to display the VIRTUAL SERVER subsection. 10.4.1. The VIRTUAL SERVER Subsection: The VIRTUAL SERVER subsection panel shown in Figure 1.35, "The VIRTUAL SERVERS Subsection", allows you to configure an individual virtual server. Links to sub...

Page 61: ...erver. This name is not the hostname for the machine, so make it descriptive and easily identifiable. You can even reference the protocol used by the virtual server, such as HTTP. Application port: The port number through which the service application will listen. Protocol: Provides a choice of UDP or TCP, in a drop-down menu. Virtual IP Address: The virtual server's floating IP address. ...

Page 62: ...l server node comes online, the least-connections table is reset to zero so the active LVS router routes requests as if all the real servers were freshly added to the cluster. This option prevents a new server from becoming bogged down with a high number of connections upon entering the cluster. Load monitoring tool: The LVS router can monitor the load on the various real servers by using either r...

Page 63: ...tatus of the physical server hosts for a particular virtual service. Figure 1.36. The REAL SERVER Subsection. Click the ADD button to add a new server. To delete an existing server, select the radio button beside it and click the DELETE button. Click the EDIT button to load the EDIT REAL SERVER panel, as seen in Figure 1.37, "The REAL SERVER Configuration Panel". ...

Page 64: ...e name for the real server. Tip: This name is not the hostname for the machine, so make it descriptive and easily identifiable. Address: The real server's IP address. Since the listening port is already specified for the associated virtual server, do not add a port number. ...

Page 65: ...s the administrator to specify a send/expect string sequence to verify that the service for the virtual server is functional on each real server. It is also the place where the administrator can specify customized scripts to check services requiring dynamically changing data. Figure 1.38. The EDIT MONITORING SCRIPTS Subsection. Sending Program: For more advanced service verification, you can use this fi...

Page 66: ...u can alter this value depending on your needs. If you leave this field blank, the nanny daemon attempts to open the port and assumes the service is running if it succeeds. Only one send sequence is allowed in this field, and it can only contain printable ASCII characters as well as the following escape characters: \n for new line, \r for carriage return, \t for tab, and \ to escape the next character which follows...
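
The send-string rules in the Page 66 excerpt (printable ASCII plus escapes for new line, carriage return, tab, and escaping the next character) can be sketched as a small decoder. This is an illustration only: `decode_send_string` is a hypothetical helper, not part of nanny or the Piranha Configuration Tool, and it assumes the escapes are written with a leading backslash.

```python
def decode_send_string(raw: str) -> str:
    """Expand backslash escapes in a send string and reject non-printable input."""
    escapes = {"n": "\n", "r": "\r", "t": "\t"}
    out = []
    chars = iter(raw)
    for ch in chars:
        # Only printable ASCII (space through tilde) may appear in the field.
        if not 32 <= ord(ch) <= 126:
            raise ValueError(f"non-printable character in send string: {ch!r}")
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, None)
        if nxt is None:
            raise ValueError("send string ends with a dangling backslash")
        if not 32 <= ord(nxt) <= 126:
            raise ValueError(f"non-printable character in send string: {nxt!r}")
        # Known escapes expand; any other escaped character is taken literally.
        out.append(escapes.get(nxt, nxt))
    return "".join(out)
```

For instance, a field containing `GET / HTTP/1.0\r\n\r\n` (typed with literal backslashes) would decode to the request line followed by two CR LF pairs.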

Page 67: ...nd used to manage cluster configuration in a graphical setting. Cluster Logical Volume Manager (CLVM): clvmd: The daemon that distributes LVM metadata updates around a cluster. It must be running on all nodes in the cluster and will give an error if a node in the cluster does not have this daemon running. lvm: LVM2 tools; provides the command-line tools for LVM2. system-config-lvm: Provides graphical user in...

Page 68: ...ch as votes. libcman.so.<version number>: Library for programs that need to interact with cman.ko. Resource Group Manager (rgmanager): clusvcadm: Command used to manually enable, disable, relocate, and restart user services in a cluster. clustat: Command used to display the status of the cluster, including node membership and services running. clurgmgrd: Daemon used to handle user service requests including servic...

Page 69: ... Fence agent used with GNBD storage. fence_scsi: I/O fencing agent for SCSI persistent reservations. fence_egenera: Fence agent used with Egenera BladeFrame system. fence_manual: Fence agent for manual interaction. NOTE: This component is not supported for production environments. fence_ack_manual: User interface for fence_manual agent. fence_node: A program which performs I/O fencing on a single node. fence_x...

Page 70: ...ates a GFS file system on a storage device. gfs_quota: Command that manages quotas on a mounted GFS file system. gfs_tool: Command that configures or tunes a GFS file system. This command can also gather a variety of information about the file system. mount.gfs: Mount helper called by mount(8); not used by user. lock_harness.ko: Implements a pluggable lock module interface for GFS that allows for a variety o...

Page 71: ...is started by the /etc/rc.d/init.d/pulse script. It then reads the configuration file /etc/sysconfig/ha/lvs.cf. On the active LVS router, pulse starts the LVS daemon. On the backup router, pulse determines the health of the active router by executing a simple heartbeat at a user-configurable interval. If the active LVS router fails to respond after a user-configurable interval, it initiates failover. During...

Page 72: ... A separate process runs for each service defined on each real server. lvs.cf: This is the LVS configuration file. The full path for the file is /etc/sysconfig/ha/lvs.cf. Directly or indirectly, all daemons get their configuration information from this file. Piranha Configuration Tool: This is the Web-based tool for monitoring, configuring, and administering LVS. This is the default tool to maintain the etc ...

Page 73: ...ual I/O Fencing. fence_apc(8): I/O Fencing agent for APC MasterSwitch. fence_bladecenter(8): I/O Fencing agent for IBM Bladecenter. fence_brocade(8): I/O Fencing agent for Brocade FC switches. fence_bullpap(8): I/O Fencing agent for Bull FAME architecture controlled by a PAP management console. fence_drac(8): fencing agent for Dell Remote Access Card. fence_egenera(8): I/O Fencing agent for the Egenera BladeFrame. f...

Page 74: ...t. clusvcadm(8): Cluster User Service Administration Utility. clustat(8): Cluster Status Utility. Clurgmgrd, clurgmgrd(8): Resource Group (Cluster Service) Manager Daemon. clurmtabd(8): Cluster NFS Remote Mount Table Daemon. GFS: gfs_fsck(8): Offline GFS file system checker. gfs_grow(8): Expand a GFS filesystem. gfs_jadd(8): Add journals to a GFS filesystem. gfs_mount(8): GFS mount options. gfs_quota(8): Manipulate GFS disk quo...

Page 75: ...Hat clustering services. ipvsadm(8): Linux Virtual Server administration. ipvsadm-restore(8): restore the IPVS table from stdin. ipvsadm-save(8): save the IPVS table to stdout. nanny(8): tool to monitor status of services in a cluster. send_arp(8): tool to notify network of a new IP address / MAC address mapping. 3. Compatible Hardware: For information about hardware that is compatible with Red Hat Cluster Suite comp...

Page 77: ... routing requirements hardware 34 requirements network 34 requirements software 34 routing methods NAT 33 three tiered high availability cluster 31 N NAT routing methods LVS 33 network address translation see NAT O overview economy 18 performance 18 scalability 18 P Piranha Configuration Tool CONTROL MONITORING 45 EDIT MONITORING SCRIPTS Subsection 55 GLOBAL SETTINGS 47 login panel 45 necessary so...
