
●  For cluster, stripe, and atomic stripe parallel modes, add the failover option to the mount line or /etc/fstab entry to specify failover and failback.
●  For loadbalance mode, failover and failback are specified by default.
DVS failover and failback are done in an active-active manner. Multiple servers must be specified in the /etc/fstab entry for failover and failback to function. When a server fails, it is taken out of the list of servers to use for the mount point until it is rebooted. All open and new files use the remaining servers, as described in the cluster, stripe, and atomic stripe parallel sections. Files not using the failed server are not affected.
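For example, a stripe parallel /etc/fstab entry with failover enabled might look like the following sketch. The mount point, projected path, server node names, and maxnodes value here are placeholders, not a prescribed configuration; consult DVS Client Mount Point Options for the authoritative option list.

    /dvs-shared  /dvs  dvs  path=/dvs-shared,nodename=r0s1c1n1:r0s1c1n2:r0s1c1n3,maxnodes=3,failover  0  0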
When failover occurs:
●  If all servers fail, I/O is retried as described by the retry option (see DVS Client Mount Point Options on page 143).
●  Any mount point using loadbalance mode automatically recalibrates the existing client-to-server routes to ensure that the clients are evenly distributed across the remaining servers. When failback occurs, this process is repeated.
●  Any mount point using cluster parallel mode automatically redirects I/O to one of the remaining DVS servers for any file that previously routed to the now-down server. When failback occurs, files are rerouted to their original server. (A conceptual sketch of this per-file rerouting follows this list.)
●  Any mount point using stripe parallel mode or atomic stripe parallel mode automatically restripes I/O across the remaining DVS servers in an even manner. When failback occurs, files are restriped to their original pattern.
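The following Python sketch is illustrative only and is not DVS source code; it assumes a hash-based, per-file server selection (the function name and hashing scheme are inventions for this example) to show how routing against the available server list changes across failover and failback.

    import zlib

    def pick_server(path, servers):
        # Cluster-parallel-style routing: each file maps to one server by hash.
        return servers[zlib.crc32(path.encode()) % len(servers)]

    servers = ["r0s1c1n1", "r0s1c1n2", "r0s1c1n3"]
    path = "/dvs-shared/app/output.dat"
    print("before failure:", pick_server(path, servers))

    # r0s1c1n3 fails and is removed from the available list; any file that
    # previously routed to it is redirected to one of the remaining servers.
    available = [s for s in servers if s != "r0s1c1n3"]
    print("after failover:", pick_server(path, available))

    # On failback the server rejoins the list, so files return to their
    # original server.
    print("after failback:", pick_server(path, servers))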
Client System Console Message: "DVS: file_node_down: removing from list of available servers
for 2 mount points"
The following message indicates that a DVS server has failed.
DVS: file_node_down: removing r0s1c1n3 from list of available
servers for 2 mount points
In this example, r0s1c1n3 is the DVS server that has failed and has been removed from the list of available servers specified in the /etc/fstab entry for the DVS projection.
After the issue is resolved, the following message is printed to the console log of each client of
the projection:
DVS: file_node_up: adding r0s1c1n3 back to list of available servers
for 2 mount points
6.1.6.2 About DVS Periodic Sync
DVS periodic sync improves data resiliency and facilitates a degree of application resiliency so that applications
may continue executing in the event of a stalled file system or DVS server failure. Without periodic sync, such an
event would result in DVS clients killing any processes with open files that were written through the failed server.
Any data written through that server that was only in the server's page cache and not written to disk would be lost,
and processes using the file would see data corruption.
Periodic sync works by periodically performing fsync on individual files with dirty pages on the DVS servers, to ensure those files are written to disk. For each file, the DVS client tracks when a DVS server performs a file sync and when processes on DVS clients write to it, and then notifies the DVS server when fsync is needed.
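The following Python sketch is illustrative only and is not DVS source code; the data structures, function names, and sync interval are assumptions used to show the general idea of tracking files with unsynced writes and periodically flushing them with fsync.

    import os
    import threading
    import time

    _dirty = {}                  # path -> file descriptor with unsynced writes
    _lock = threading.Lock()

    def note_write(path, fd):
        # Bookkeeping when a write arrives: remember that this file is dirty.
        with _lock:
            _dirty[path] = fd

    def periodic_sync(interval_seconds=30):
        # Periodically flush every file written since the last pass, so its
        # dirty page-cache data reaches disk before a possible server failure.
        while True:
            time.sleep(interval_seconds)
            with _lock:
                pending = dict(_dirty)
                _dirty.clear()
            for path, fd in pending.items():
                os.fsync(fd)     # write this file's dirty pages to storage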