
For example, let us assume you have configured a location constraint for resource r1 to
preferably run on node1. If it fails there, migration-threshold is checked and compared
to the failcount. If failcount >= migration-threshold, the resource is migrated to the
node with the next best preference.
By default, once the threshold has been reached, the node will no longer be allowed to
run the failed resource until the administrator manually resets the resource’s failcount
(after fixing the failure cause).
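The threshold check described above can be illustrated with a small model. This is only a
sketch of the decision rule as stated in the text, not the cluster's actual placement
logic: the function name pick_node, the preference scores, and the dictionary layout are
hypothetical and chosen for illustration.

```python
from typing import Dict, Optional


def pick_node(preferences: Dict[str, int],
              failcounts: Dict[str, int],
              migration_threshold: int) -> Optional[str]:
    """Return the most preferred node that is still allowed to run the resource.

    A node is excluded once its failcount has reached migration_threshold
    (failcount >= migration-threshold). Hypothetical helper for illustration.
    """
    allowed = [
        node for node in preferences
        if failcounts.get(node, 0) < migration_threshold
    ]
    if not allowed:
        return None  # no node is left that may run the resource
    # Among the allowed nodes, choose the one with the highest preference score.
    return max(allowed, key=lambda node: preferences[node])


# r1 prefers node1 (score 100) over node2 (score 50); the scores are assumed values.
prefs = {"node1": 100, "node2": 50}
print(pick_node(prefs, {"node1": 1}, migration_threshold=2))
print(pick_node(prefs, {"node1": 2}, migration_threshold=2))
```

With one failure on node1 the resource stays on node1. After the second failure the
threshold is reached and node2, the next best preference, is selected.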
However, it is possible to expire the failcounts by setting the resource’s failure-timeout
option. So a setting of migration-threshold=2 and failure-timeout=60s would cause the
resource to migrate to a new node after two failures and potentially allow it to move back
(depending on the stickiness and constraint scores) after one minute.
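The effect of failure-timeout can be sketched in the same simplified way. The function
below is a hypothetical model, assuming that the failcount is treated as expired once the
timeout has elapsed since the most recent failure. It ignores stickiness and constraint
scores, which the text notes also influence whether the resource actually moves back.

```python
from typing import Optional


def effective_failcount(failcount: int,
                        last_failure: float,
                        now: float,
                        failure_timeout: Optional[float]) -> int:
    """Return the failcount that still counts against the node.

    Times are in seconds. failure_timeout=None models the default behavior,
    in which failcounts never expire on their own. Hypothetical helper.
    """
    if failure_timeout is not None and now - last_failure >= failure_timeout:
        return 0  # the failures have expired; the node may be considered again
    return failcount


# migration-threshold=2, failure-timeout=60s, second failure at t=0
threshold = 2
for now in (30.0, 60.0):
    count = effective_failcount(2, last_failure=0.0, now=now, failure_timeout=60.0)
    print(now, count, "node banned" if count >= threshold else "node allowed")
```

In this model the node is still banned 30 seconds after the last failure and becomes
eligible again once 60 seconds have passed.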
There are two exceptions to the migration threshold concept, occurring when a resource
either fails to start or fails to stop: Start failures set the failcount to INFINITY and thus
always cause an immediate migration. Stop failures cause fencing (when stonith-enabled is
set to true, which is the default). In case there is no STONITH resource defined (or
stonith-enabled is set to false), the resource will not migrate at all.
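The two exceptions can be summarized as a small decision function. This is a sketch of
the rules as described in the text only. The function name, the operation labels, and the
returned action strings are hypothetical and do not correspond to any cluster API.

```python
def failure_action(operation: str,
                   failcount: int,
                   migration_threshold: int,
                   stonith_enabled: bool = True,
                   has_stonith_resource: bool = True) -> str:
    """Return the reaction to a failed operation, following the rules above.

    operation is "start", "stop", or any other label (for example "monitor").
    Hypothetical helper for illustration.
    """
    if operation == "start":
        # Start failures set the failcount to INFINITY: immediate migration.
        return "migrate"
    if operation == "stop":
        # Stop failures lead to fencing, but only if fencing is possible.
        if stonith_enabled and has_stonith_resource:
            return "fence"
        return "no migration"
    # All other failures follow the regular threshold comparison.
    return "migrate" if failcount >= migration_threshold else "stay"


print(failure_action("start", failcount=0, migration_threshold=2))
print(failure_action("stop", failcount=0, migration_threshold=2,
                     stonith_enabled=False))
print(failure_action("monitor", failcount=1, migration_threshold=2))
```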
To clean up the failcount for a resource with the Linux HA Management Client, select
Management in the left pane, select the respective resource in the right pane and click
Cleanup Resource in the toolbar. This executes the commands crm_resource -C and
crm_failcount -D for the specified resource on the specified node. For more
information, see also crm_resource(8) (page 166) and crm_failcount(8) (page 157).
High Availability Guide