Shan et al., 2019 - Google Patents
Accelerating the performance of modal aerosol module of E3SM using OpenACCShan et al., 2019
View PDF- Document ID
- 4834166481722355240
- Author
- Shan H
- Zhao Z
- Wagner M
- Publication year
- Publication venue
- International Workshop on Accelerator Programming Using Directives
External Links
Snippet
Using GPUs to accelerate the performance of HPC applications has recently gained great momentum. Energy Exascale Earth System Model (E3SM) is a state-of-the-science earth system model development and simulation project and has gained national recognition. It …
- 239000000443 aerosol 0 title abstract description 9
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5066—Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
- G06F8/443—Optimisation
- G06F8/4441—Reducing the execution time required by the program code
- G06F8/4442—Reducing the number of cache misses; Data prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/456—Parallelism detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3457—Performance evaluation by simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Abdelfattah et al. | Kblas: An optimized library for dense matrix-vector multiplication on gpu accelerators | |
| Ray et al. | XSpeed: Accelerating reachability analysis on multi-core processors | |
| Kim et al. | Performance analysis and tuning for general purpose graphics processing units (GPGPU) | |
| Kristensen et al. | Bohrium: a virtual machine approach to portable parallelism | |
| Szustak et al. | Using Intel Xeon Phi coprocessor to accelerate computations in MPDATA algorithm | |
| Borisenko et al. | A GPU parallelization of branch-and-bound for multiproduct batch plants optimization | |
| Rubin et al. | Maps: Optimizing massively parallel applications using device-level memory abstraction | |
| Levesque et al. | High performance computing: programming and applications | |
| Andión et al. | Locality-aware automatic parallelization for GPGPU with OpenHMPP directives | |
| Martínez-del-Amor et al. | Parallel simulation of Population Dynamics P systems: updates and roadmap | |
| Shan et al. | Accelerating the performance of modal aerosol module of E3SM using OpenACC | |
| Huo et al. | An execution strategy and optimized runtime support for parallelizing irregular reductions on modern gpus | |
| Davis et al. | Paradigmatic shifts for exascale supercomputing | |
| Del Monte et al. | A scalable GPU-enabled framework for training deep neural networks | |
| Wang et al. | CPU–GPU hybrid parallel strategy for cosmological simulations | |
| Shan¹ et al. | Aerosol Module of E3SM Using | |
| Celoria | Porting of DSMC to multi-GPUs using OpenACC | |
| Benz | Evaluation of GPU Acceleration for WRF–SFIRE | |
| Ahmadzadeh et al. | GPU Acceleration of LS-SVM, Based on Fractional Orthogonal Functions | |
| Valero-Lara et al. | A portable and heterogeneous LU factorization on IRIS | |
| Zhai et al. | GPU Acceleration of the FINE/FR CFD Solver in a heterogeneous environment with OpenACC Directives | |
| Tithi et al. | SU3_Bench on a Programmable Integrated Unified Memory Architecture (PIUMA) and How that Differs from Standard NUMA CPUs | |
| Jendersie et al. | A GPU parallelization of the neXtSIM-DG dynamical core (v0. 3.1) | |
| Cui | Directive-Based Data Partitioning and Pipelining and Auto-Tuning for High-Performance GPU Computing | |
| Chandran et al. | A review on GPU programming strategies and recent trends in GPU computing |