Introduction to Parallelisation
|
Parallelization involves dividing complex computational tasks into smaller subtasks that can be executed simultaneously, leading to improved performance and reduced execution times.
Parallel programming enables algorithms to leverage parallel hardware architectures, such as multi-core CPUs and GPUs, to achieve faster computations.
OpenMP is a widely used parallel programming API that simplifies the development of multithreaded programs through compiler directives, library routines, and environment variables, in a portable manner.
Shared memory parallelism is a fundamental concept in OpenMP, where threads share memory and work together to solve computational problems effectively.
|
Introduction to OpenMP
|
OpenMP is an industry-standard API for parallel programming in shared memory environments.
It supports C, C++, and Fortran and is governed by the OpenMP Architecture Review Board (ARB).
OpenMP follows the fork-join model: a primary (master) thread forks a team of worker threads to execute parallel tasks, and the threads join back together afterwards.
Compiler directives guide the compiler to create parallel code, e.g. #pragma omp parallel.
Runtime library routines offer predefined functions for thread control and synchronization.
Environment variables fine-tune OpenMP runtime behavior.
To compile and run OpenMP programs, include the <omp.h> header, compile with the -fopenmp flag (supported by GCC and Clang), and execute the compiled binary, as in the short example below.
OpenMP is comparatively user-friendly, as the runtime automates thread creation and the distribution of work across the available cores.
Both OpenMP and lower-level threading APIs (such as POSIX threads) provide effective parallel programming options. The choice depends on factors like ease of use, control, and performance optimization.
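A minimal sketch of the fork-join model and the library routines above, assuming GCC or Clang with OpenMP support (the file name hello_omp.c is illustrative):

```c
/* hello_omp.c - minimal OpenMP example (illustrative, not from the lesson) */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* Fork a team of threads; every thread executes the block below. */
    #pragma omp parallel
    {
        int thread_id = omp_get_thread_num();     /* this thread's index */
        int num_threads = omp_get_num_threads();  /* size of the team    */
        printf("Hello from thread %d of %d\n", thread_id, num_threads);
    }
    /* The threads join back into a single thread here. */
    return 0;
}
```

Compiling with gcc -fopenmp hello_omp.c -o hello_omp and running with, for example, OMP_NUM_THREADS=4 ./hello_omp should print one line per thread.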
|
Writing Parallel Applications with OpenMP
|
Use #pragma omp parallel to define a parallel region of code.
There are two types of variable scoping for parallel regions: shared (variables are shared across threads) and private (each thread has its own copy of a variable, separate from those of other threads).
To avoid ambiguous code behaviour, it is good practice to set the default variable-sharing policy to none (default(none)) and to declare the scoping of each variable explicitly.
Using #pragma omp parallel for is a shorter way of defining an omp parallel region with an omp for work-sharing loop inside it.
Using the library functions omp_get_num_threads() and omp_get_thread_num() outside of a parallel region will return 1 and 0 respectively.
There are five different scheduling methods: static, dynamic, guided, auto, and runtime.
The OMP_SCHEDULE environment variable defines the scheduler and chunk size used when schedule(runtime) is specified, as in the sketch below.
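A sketch that pulls these points together, assuming GCC or Clang with -fopenmp (the file name sum_squares.c and the array size are illustrative): a combined parallel for, explicit scoping with default(none), a fixed schedule, and a schedule(runtime) loop controlled through OMP_SCHEDULE.

```c
/* sum_squares.c - illustrative example of scoping and scheduling clauses */
#include <stdio.h>
#include <omp.h>

#define N 1000

int main(void)
{
    double values[N];
    double total = 0.0;

    /* Combined parallel for: forks a team and shares the loop iterations
     * between threads. default(none) forces the scope of every variable
     * to be stated explicitly; the loop index i is private automatically. */
    #pragma omp parallel for default(none) shared(values) schedule(static, 100)
    for (int i = 0; i < N; i++) {
        values[i] = (double)i * i;
    }

    /* schedule(runtime) defers the choice of scheduler and chunk size to
     * the OMP_SCHEDULE environment variable. The reduction clause gives
     * each thread a private partial sum, combined safely into total. */
    #pragma omp parallel for default(none) shared(values) reduction(+:total) schedule(runtime)
    for (int i = 0; i < N; i++) {
        total += values[i];
    }

    printf("Sum of squares of 0..%d is %.1f\n", N - 1, total);
    return 0;
}
```

Running with, say, OMP_SCHEDULE="dynamic,50" ./sum_squares makes the second loop use a dynamic scheduler with a chunk size of 50.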
|
Synchronisation and Race Conditions
|
Synchronising threads is important to ensure data consistency and code correctness.
A race condition happens when multiple threads try to access and modify the same piece of data at the same time.
OpenMP has many synchronisation mechanisms which are used to coordinate threads.
Atomic operations, locks and critical regions can be used to prevent race conditions, as in the sketch below.
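A sketch of how a race condition arises on a shared counter and how an atomic operation or a critical region removes it (the file name race_demo.c and variable names are illustrative):

```c
/* race_demo.c - illustrative race condition and two ways to prevent it */
#include <stdio.h>
#include <omp.h>

#define N 100000

int main(void)
{
    long unsafe = 0, safe_atomic = 0, safe_critical = 0;

    #pragma omp parallel for default(none) shared(unsafe, safe_atomic, safe_critical)
    for (int i = 0; i < N; i++) {
        /* Race condition: threads read, modify and write 'unsafe' at the
         * same time, so some increments can be lost. */
        unsafe++;

        /* Atomic operation: the increment is performed indivisibly. */
        #pragma omp atomic
        safe_atomic++;

        /* Critical region: only one thread at a time executes this block. */
        #pragma omp critical
        {
            safe_critical++;
        }
    }

    printf("unsafe = %ld, atomic = %ld, critical = %ld (expected %d)\n",
           unsafe, safe_atomic, safe_critical, N);
    return 0;
}
```

With more than one thread, the unprotected counter may end up below N because concurrent updates are lost, while the atomic and critical counters always reach N.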
|
Introduction to Hybrid Parallelism
|
Hybrid parallelism is the combination of two or more different parallelisation schemes.
One of the most common forms of hybrid parallelism is to use MPI with OpenMP.
The main advantages of a hybrid MPI+OpenMP approach are reduced memory usage and improved scaling and load balancing.
However, this comes at the cost of increased overheads, code complexity and potentially limited portability.
In MPI+OpenMP, a common approach is to split up jobs across MPI processes and to parallelise job execution within each process using OpenMP, as in the sketch below.
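A minimal sketch of this pattern, assuming an MPI implementation such as Open MPI or MPICH alongside an OpenMP-capable compiler (the file name hybrid_hello.c is illustrative):

```c
/* hybrid_hello.c - illustrative MPI+OpenMP hybrid example */
#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv)
{
    int provided, rank;

    /* Request FUNNELED support: only the main thread makes MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each MPI process forks its own team of OpenMP threads to work on
     * that process's share of the job. */
    #pragma omp parallel
    {
        printf("MPI rank %d, OpenMP thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
```

Compiling with mpicc -fopenmp hybrid_hello.c -o hybrid_hello and launching with, for example, OMP_NUM_THREADS=4 mpirun -n 2 ./hybrid_hello should print one line per rank-thread pair.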
|