.. _using_taskfarmer:

****************************************
Leveraging full nodes using |Taskfarmer|
****************************************

.. sectionauthor:: Thomas Alexander, Gaël Donval

When you submit a job to |Balena|, the submission script contains instructions requesting a set of computer resources for your calculation. These include the maximum execution time and the number of machines (known as nodes) required. In general, |hpc| (*high performance computing*) clusters such as |Balena| are tuned for submissions that take advantage of many processors. Such submissions are referred to as parallel jobs and, in our case, nodes are not shared between users. In other words, the minimum allocation on |Balena| is a full node comprising 16 separate CPU cores. This is problematic for programs such as |Music| and |Raspa|, which cannot natively make use of many |cores| at once: in this section we will see how to work around that limitation with a program called |Taskfarmer|.

Serial *vs.* parallel programs
------------------------------

In the realm of |hpc|, there are two main categories of software: *serial* programs, designed to run on a single |core|, and *parallel* programs, designed to run on multiple |cores| on multiple nodes at once. Running a program in parallel rather than in serial generally reduces execution time by splitting a fixed amount of work across different |cpus|: without this, some calculations would take weeks or months on a single |core|. There is a catch, though: which category any particular program belongs to is not for you to choose. Parallel programs can be run in parallel because of the way they were programmed; some problems cannot even be parallelised at all, no matter how they are solved!
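To get a feel for the difference, here is a toy illustration you can try in any shell, with :command:`sleep 1` standing in for a one-second serial calculation (this is our own example, not part of the |Taskfarmer| documentation):

.. code-block:: console

   $ # serial: the four tasks run one after another (about 4 seconds in total)
   $ time (sleep 1; sleep 1; sleep 1; sleep 1)
   $ # "farmed": the four tasks run at the same time (about 1 second in total)
   $ time (sleep 1 & sleep 1 & sleep 1 & sleep 1 & wait)

Task farming applies the same idea to real calculations: many independent serial tasks running side by side on the |cores| you have allocated anyway.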
|Music| and |Raspa|, for instance, are serial programs: no matter how many nodes or |cores| you allocate, no matter whether or not you prepend them with :command:`mpiexec` (more on that later), and no matter how many such instructions you put in your submission script, they will use one and only one |core| out of all the |cores| you allocated, wasting all the others.

Running serial programs on the cluster
--------------------------------------

The second catch is that, by design, you cannot allocate *fewer* than 16 |cores| at once on |Balena|. There is absolutely no way around this. There is also no safeguard: nothing will prevent you from carelessly submitting a serial job on |Balena|, but you will be wasting 15 |cores| in the process. Such behaviour will, however, automatically be penalised later on by the scheduler according to fair-use rules: at best, you will need to wait for days to get your jobs running and you will not get much done.

The way around this is to resort to what is often called *task farming*: a list of serial tasks is given to a special program, which assigns serial jobs to each of the |cores| on a node. |Taskfarmer| is such a program: if you provide it with at least 16 serial tasks (e.g. 16 different |Music| calculations), it will execute those 16 tasks at the same time using all the available |cores|, keeping them all busy. This requires a little bit of planning but it is worth it.

Making the program available
----------------------------

As usual on |Balena|, you must make programs available by loading them. If it is not already done, load the ``ce-molsim`` repository:

.. code-block:: console

   $ module purge
   $ module load group ce-molsim stack

In any case, load |Taskfarmer|:

.. code-block:: console

   $ module load taskfarmer

Now |Taskfarmer| should be available (watch out for error messages). You can confirm this by typing the following:

.. code-block:: console

   $ which taskfarmer
   /apps/group/ce-molsim/taskfarmer/bin/taskfarmer

If something is wrong, the previous command should fail very loudly, telling you where it was looking for the program in the process.

.. hint::

   Every time you log on |Balena|, and for every job you want to submit, you need to go through the module loading process all over again. There is a way to make these changes permanent though: you can edit your :file:`~/.bashrc` file (which contains your command prompt settings) and simply add the module purging/loading commands at the end of it:

   .. code-block:: console

      $ nano ~/.bashrc

   The next time you log on |Balena|, all the specified modules should be readily available once and for all.

Usage
-----

As with most command-line programs, you can get the usage (or synopsis) of the command by passing it the ``--help`` flag:

.. code-block:: console

   $ taskfarmer --help
   TaskFarmer - a simple task farmer for running serial tasks with mpiexec.

   Usage: mpiexec -n CORES taskfarmer [-h] -f FILE [-v] [-w] [-r]
                                      [-s SLEEP_TIME] [-m MAX_RETRIES]

   Available options:
     -h/--help         : Print this help information
     -f/--file         : Location of task file (required)
     -v/--verbose      : Print status updates to stdout
     -w/--wait-on-idle : Wait for more tasks when idle
     -r/--retry        : Retry failed tasks
     -s/--sleep-time   : Sleep duration when idle (seconds)
     -m/--max-retries  : Maximum number of retries for failed tasks

In our case, this means that when we write our submission script for the scheduler, |Taskfarmer| should be invoked this way:

.. code-block:: console

   $ mpiexec -n 16 taskfarmer -f <task file>

Again, square brackets denote optional arguments. Here :command:`mpiexec` is a reference to another program whose purpose is to provide the number of |cores| |Taskfarmer| is allowed to use. The ``<task file>`` is a simple text file containing the list of serial calculations that |Taskfarmer| is supposed to run concurrently on all the 16 |cores|.
Writing a task file
-------------------

Your task file should look something like this:

.. code-block:: text

   bash ~/scratch/Simulations/01/run
   bash ~/scratch/Simulations/02/run
   bash ~/scratch/Simulations/03/run
   bash ~/scratch/Simulations/04/run
   bash ~/scratch/Simulations/05/run
   bash ~/scratch/Simulations/06/run
   bash ~/scratch/Simulations/07/run
   bash ~/scratch/Simulations/08/run
   bash ~/scratch/Simulations/09/run
   bash ~/scratch/Simulations/10/run
   bash ~/scratch/Simulations/11/run
   bash ~/scratch/Simulations/12/run
   bash ~/scratch/Simulations/13/run
   bash ~/scratch/Simulations/14/run
   bash ~/scratch/Simulations/15/run
   bash ~/scratch/Simulations/16/run

Here we are supposing that you have fully set up your different |Music| or |Raspa| simulations in each :file:`~/scratch/Simulations/<number>` directory and that any specific simulation can be run by calling:

.. code-block:: console

   $ bash ~/scratch/Simulations/<number>/run

.. note::

   We merely give that pattern as an example: you don't have to conform to it, you just need to understand what these commands represent and translate the pattern to your own setup yourself.

Submitting the job
------------------

You need to write a submission script for |Balena|'s scheduler (again, use :command:`nano` to edit the file):

.. code-block:: bash

   #!/usr/bin/env bash
   #SBATCH --job-name=test
   #SBATCH --partition=batch
   #SBATCH --account=free
   #SBATCH --nodes=1
   #SBATCH --ntasks-per-node=16
   #SBATCH --time=06:00:00
   #SBATCH --export=ALL

   module purge
   module load group ce-molsim stack
   module load taskfarmer

   mpiexec -n 16 taskfarmer -f <task file>

Here, ``<task file>`` is the full path leading to your task file; you only request one node (there is no point in requesting more) and it is going to run for 6 hours at most. To actually submit the job, type:

.. code-block:: console

   $ sbatch <submission script>
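Incidentally, a task file like the one shown in this section need not be typed by hand: it can be generated with a short shell loop. Here is a sketch assuming the numbered directory layout used above (the file name :file:`tasks.txt` is just an example):

.. code-block:: console

   $ for i in $(seq -w 1 16); do echo "bash ~/scratch/Simulations/${i}/run"; done > tasks.txt

:command:`seq -w 1 16` prints the numbers 1 to 16 padded to equal width (``01`` to ``16``), matching the zero-padded directory names.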