Journal Article: Evaluating the Performance of OpenMP Offloading on the NEC SX-Aurora TSUBASA Vector Engine

The NEC SX-Aurora TSUBASA vector engine (VE) follows the tradition of long vector processors for high-performance computing (HPC). The technology combines vector computing capabilities with the popularity of the standard x86 architecture by integrating the VE as an accelerator. To decrease the burden of porting code to different accelerator types, the OpenMP specification is designed to be a single parallel programming model for all of them. Besides the availability of compiler and runtime implementations, both functionality and performance are important for the usability and acceptance of this paradigm.

Details ...

Bachelor Thesis: Analysis, optimization and application of GPGPU-accelerated high-intensity focused ultrasound simulations including heterogeneous tissue properties

NVIDIA has incorporated Tensor Cores into its newer generations of graphics cards. These accelerate matrix-matrix multiplications, which are common in data science and simulation applications. We use this technology to accelerate interactive HIFU simulation software, with the aim of minimizing treatment time for patients under anaesthesia. HIFU (high-intensity focused ultrasound) is a modern, non-invasive therapeutic approach for treating a variety of pathologies. Many of these applications require simulation to avoid risks and unnecessary damage.

Details ...

Conference Paper: OpenMP Target Device Offloading for the SX-Aurora TSUBASA Vector Engine

Driven by the heterogeneity trend in modern supercomputers, OpenMP has provided support for heterogeneous systems since 2013. Having a single programming model for all kinds of accelerator-based systems decreases the burden of porting code to different device types. The acceptance of this heterogeneous paradigm requires the availability of corresponding OpenMP compiler and runtime environments that support different target device architectures. The LLVM/Clang infrastructure is designed so that its offloading features can be extended to any new target platform.

Details ...