
8.14 Remove Temporary Spark Files from SSDs
Prerequisites
This procedure requires root privileges.
About this task
Spark writes temporary files to the SSDs of the compute nodes on which the Spark executors run. Ordinarily, Spark cleans up these temporary files when execution completes. However, Spark may sometimes fail to fully clean up its temporary files, for example, when the Spark executors are not shut down correctly. If this happens too many times, or with very large temporary files, the SSDs may begin to fill up, which can cause Spark jobs to slow down or fail.
Urika-GX checks for idle nodes once per hour and cleans up any leftover temporary files. This is handled by a cron job running on one of the login nodes that executes the /usr/sbin/cleanupssds.sh script once per hour. Follow the instructions in this procedure if this automated cleanup ever proves to be insufficient.
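For reference, the hourly cleanup is driven by cron on a login node. A minimal sketch of what such a crontab entry could look like, assuming a system-style /etc/cron.d entry with a user field (the exact schedule and file location on the installed system may differ):
0 * * * * root /usr/sbin/cleanupssds.sh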
Procedure
1. Log on to one of the login nodes as root.
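For example, from a machine with access to the system (login1 is a placeholder for the actual login node hostname):
$ ssh root@login1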
2. Kill all processes belonging to any running Spark jobs.
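For example, Spark executor JVMs normally appear in the process list under the class name CoarseGrainedExecutorBackend; a sketch of listing Spark processes and then force-killing the executors (verify that the pattern matches what actually appears in the process list on your nodes before killing):
# ps -ef | grep -i spark
# pkill -9 -f CoarseGrainedExecutorBackend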
3. Execute the /usr/sbin/cleanupssds.sh script.
# /usr/sbin/cleanupssds.sh
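Optionally, confirm that space has been reclaimed on the SSDs by checking filesystem usage on the affected nodes, for example:
# df -h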