Correctness checking

This is a short overview of the basic concepts of correctness checking (debugging).

General
The common approach of trial and error is usually not very efficient with large programs. It is easy to lose track of what has been tested and adjusted, and small changes are easily forgotten when trying to recover an earlier state. It is therefore advisable to tackle errors and warnings systematically. A good first step is to keep a logbook (which can be a simple text file) and write down exactly which error message came up, if and how it was reproducible, and the actions taken to understand what happened (e.g. inspecting a core file). Bug tracking systems (e.g. Bugzilla, trac) can also be helpful.
While debugging it is also recommended to simplify the program, e.g. by reducing the input size, the number of processes, and the number of compiler warnings. These test cases can be reused when making changes later on, so you might want to keep them. Note, however, that for parallel programs it is important to keep a sufficiently large number of processes, as issues like data races may not occur otherwise.
Different testing methods have been established for developing and altering a program, two of which are unit and regression testing. During unit testing the program is broken down into its smallest individually testable parts, which makes checking them much simpler. Regression testing is important when altering fully functional programs, as it is often not sufficient to check only the parts directly affected by the changes. Once a bug has been located and fixed, it is advisable to keep the tests involved in the process and to rerun them after later changes to check for similar bugs; a minimal test runner for this purpose is sketched below.
It is also advisable to use a source control manager (e.g. git, svn) to be able to recover an earlier state of the program.
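As a sketch of how such regression tests might be kept around (the program name, the tests/ directory and the .in/.ref file layout are assumptions made for illustration), a simple shell script can rerun the program on all saved inputs and compare the results with previously verified reference output:

    #!/bin/bash
    # Minimal regression-test runner: rerun the program on every stored
    # input file and compare the output with a verified reference result.
    failed=0
    for input in tests/*.in; do
        name=$(basename "$input" .in)
        # Run the program under test on the stored input ...
        ./my_program "$input" > "tests/${name}.out"
        # ... and compare against the previously verified reference output.
        if ! diff -q "tests/${name}.out" "tests/${name}.ref" > /dev/null; then
            echo "FAILED: ${name}"
            failed=1
        fi
    done
    exit "$failed"

Keeping such a script and the test inputs under source control makes it easy to rerun all earlier test cases after every change.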

Common issues
Small mistakes can produce confusing error messages that are often completely unrelated to the point where the mistake actually happened. A list of things you may want to look out for, followed by example compiler flags that help catch some of them:

 Are all variables initialized?


 Are there unused variables? (written but never read)
 Is there a part in the code that is never reached? (e.g. broken if-statement)
 Beware of pointers
 What are the defaults on the system you are using? (e.g. stack size too small)
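Several of these issues can be caught without a debugger by letting the compiler warn about them and by checking the system defaults. The exact flags differ between compilers; the lines below show the idea for GCC (the source and program names are placeholders):

    # Ask GCC to warn about common mistakes such as unused or (with
    # optimization enabled) uninitialized variables, and keep debug info (-g):
    gcc -Wall -Wextra -O2 -g -o my_program my_program.c

    # Check the default stack size limit of the current shell (in kbytes)
    # and raise it if it is too small for the program:
    ulimit -s
    ulimit -s unlimited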

Debugging Tools
There are usually various debugging tools available on a cluster, which can roughly be divided into two types: interpretive and direct execution. The former works on the source code and machine code level and simulates parts of the program, while the latter is attached to the program and monitors its internal state during runtime. The most common strategies are line-by-line execution and the use of breakpoints to skip over longer, irrelevant parts.
Tools that use direct execution / dynamic analysis include MUST, a correctness checker for programs parallelized with MPI, and the GNU command-line debugger GDB, which lets you set breakpoints and inspect the source code that is being executed.
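As a small illustration of the breakpoint-based approach with GDB (the program, function and variable names here are placeholders), the program is first compiled with debug symbols and then run under the debugger:

    # Compile with debug information so GDB can map machine code to source lines
    gcc -g -O0 -o my_program my_program.c

    gdb ./my_program
    (gdb) break compute_result     # stop whenever this function is entered
    (gdb) run                      # start the program
    (gdb) next                     # execute the current source line, then stop again
    (gdb) print my_variable        # inspect the current value of a variable
    (gdb) continue                 # resume until the next breakpoint or the end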
Scheduling Basics
This is an overview of the basic concepts and goals of a scheduler. An overview of which scheduler is used by which HPC center can be found at https://hpc-wiki.info/hpc/Schedulers. Further information about batch schedulers is available, as well as specific information about the schedulers SLURM, LSF and Torque.

General

Schematic of how users can access the batch system

A scheduler is software that implements a batch system on an HPC cluster. Users do not run their calculations directly and interactively (as they would on their personal workstations or laptops); instead, they submit non-interactive batch jobs to the scheduler.
The scheduler stores the batch jobs, evaluates their resource requirements and priorities, and distributes the jobs to suitable compute nodes. These workhorses make up the majority of an HPC cluster (about 98% of it) and are its most powerful, but also most power-consuming, parts.
In contrast to the login nodes (used interactively for compiling and testing user software), the compute nodes are usually not directly accessible (e.g. via ssh).
The scheduler is thus the interface for users on the login nodes to send work to the compute nodes.
This requires the user to ask the scheduler for time and memory resources and to specify the application inside a jobscript (a minimal example is sketched after the list below).
This jobscript can then be submitted to the batch system via the scheduler, which will first add the job to a job queue. Based on the resources the job needs, the scheduler will decide when the job will leave the queue and on which (part of the) back-end nodes it will run.
Be careful about the resources you request and know your system's limits. For example, if you
• request less time than your job actually needs to finish, the scheduler will simply kill the job once the allocated time is up;
• request more memory than is available on the system, your job might be stuck in the queue forever.
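As an illustration, a minimal jobscript for the SLURM scheduler could look like the sketch below; the job name, resource values, module and application names are placeholders that depend on the cluster and the program:

    #!/bin/bash
    #SBATCH --job-name=my_simulation      # name shown in the queue
    #SBATCH --time=01:00:00               # requested wall-clock time (hh:mm:ss)
    #SBATCH --nodes=2                     # number of compute nodes
    #SBATCH --ntasks-per-node=48          # MPI processes per node
    #SBATCH --mem=90G                     # memory per node
    #SBATCH --output=job_%j.log           # file for the job output (%j = job id)

    # Load the required software and start the application (names are examples)
    module load my_mpi_module
    srun ./my_application input.dat

The script is then submitted with sbatch jobscript.sh; squeue shows whether the job is still waiting in the queue or already running.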

Purpose
Generally speaking, every scheduler has three main goals:
• minimize the time between job submission and job completion: no job should stay in the queue for extensive periods of time
• optimize CPU utilization: the CPUs of the supercomputer are one of the core resources for a big application; therefore, there should be only a few time slots where a CPU is not working
• maximize the job throughput: manage as many jobs per unit of time as possible

Illustration

Schematic of how a scheduler may distribute jobs onto nodes

Assuming that the batch system you are using consists of 6 nodes, this is how the scheduler could place the nine jobs in the queue onto the available nodes. The goal is to eliminate wasted resources, which show up in the schematic as free areas where a node is not executing any job. Therefore, the jobs may not be distributed among the nodes in the same order in which they first entered the queue. The space a job takes up in the schedule is determined by the time and the number of nodes required to execute it.

Scheduling Algorithms
There are two very basic strategies that schedulers can use to determine which job to run next.
Note that modern schedulers do not stick strictly to just one of these algorithms, but rather employ a combination of the two. In addition, there are many more aspects a scheduler has to take into consideration, e.g. the current system load.
First Come, First Serve
Jobs are run in the exact order in which they first enter the queue. The advantage is that every job will definitely be run; however, very small jobs might wait for a disproportionately long time compared to their actual execution time.
Shortest Job First
Based on the execution time declared in the jobscript, the scheduler can estimate how long it will take to execute each job. The jobs are then ranked by that time from shortest to longest. While short jobs will start after a short waiting time, long-running jobs (or at least jobs declared as such) might never actually start.
Backfilling
With backfilling, the scheduler maintains the concept of "First Come, First Serve" without preventing long-running jobs from executing. The scheduler checks whether the first job in the queue can be executed. If so, the job is executed without further delay. If not, the scheduler goes through the rest of the queue to check whether another job can be executed without extending the waiting time of the first job in the queue. If it finds such a job, the scheduler simply runs it. Since jobs that only need a few compute resources are easily "backfillable", small jobs will usually encounter short queue times.
