11/29/2023

This paper introduces the Yet Another Kernel Launcher (YAKL) C++ portability library, which strives to enable user-level code with the look and feel of Fortran code. YAKL fills a niche capability important particularly to scientific applications seeking to port Fortran code quickly to a portable C++ library. YAKL places heavy emphasis on simplicity, readability, and productivity, with performance mainly emphasizing Graphics Processing Units (GPUs). Central to YAKL’s ability to allow Fortran-like user-level code are three features: (1) a multi-dimensional Array class that allows Fortran behavior; (2) a limited library of Fortran intrinsic functions; and (3) an efficient pool allocator that transparently enables cheap, frequent allocations and deallocations of YAKL Arrays. While YAKL allows Fortran-style code, it also allows Arrays that exhibit C-like behavior, including row-major index ordering and lower bounds of 0. YAKL currently supports CPUs, CPU threading, and Nvidia, AMD, and Intel GPUs. The C++ portability approach is briefly explained, YAKL’s main features are described, and code examples are given that demonstrate YAKL’s usage. The intended audience includes both C++ developers and Fortran developers unfamiliar with C++.

Code portability can be an overwhelming subject to consider because of how quickly configurations can tensor out to different coding languages / augmentations, compilers, programming interfaces, hardware targets, and system capabilities. This paper focuses on lower-level languages commonly used in scientific software programming, such as Fortran, C, C++, CUDA, HIP, and SYCL – specifically Fortran and C++. This is not to say that other languages are not important or common in scientific software, but rather to set the scope of what is considered here. Already, five to six languages have been mentioned, not to mention that C++ itself has many expressions in terms of which features are used and whether modularity is achieved largely through inheritance or template expressions. Further, the primary focus of this paper is on accelerated Graphics Processing Unit (GPU) hardware targets, though attention is paid to Central Processing Unit (CPU) considerations and CPU-level threading that typically maps to POSIX (Portable Operating System Interface) threads, or “pthreads”.

There are quite a few compilers to consider, each capable of its own set of languages, APIs, and hardware targets, including but certainly not limited to: GNU, Clang, IBM, Intel, Cray / HPE, and Nvidia (including what was formerly PGI). There are many parallel programming interfaces / specifications to consider as well, including the Message Passing Interface (MPI), OpenACC, and OpenMP in its two somewhat distinct flavors: OpenMP ≤ 4.0 for CPU-level threading and OpenMP target offload for accelerators with disparate memory spaces. Hardware targets include but are not limited to CPUs, SIMD units, pthreads, and Nvidia, AMD, and Intel GPUs. There are systems that allow MPI to use pointers to data in separate device memory spaces. There are systems that allow host-allocated data to be paged automatically to disparate accelerator memory spaces and device-allocated memory to be paged automatically to host memory.