ANALYSIS OF CUDA EFFICIENCY IN SOLVING LINEAR TRIDIAGONAL SYSTEMS FOR THEORETICAL OPTION PRICING
ITMO University; Post-Graduate Student
Y. A. Shpolyanskiy
ITMO University; Professor
M. S. Kosyakov
ITMO University; Department of Computer Technology;
Abstract. The parallel cyclic reduction (PCR) method for solving tridiagonal systems of linear equations is implemented on a GPU. Forming the matrix directly in GPU global memory is shown to be advisable. The approach yields a more than 20-fold speedup over a single-threaded calculation. When data transfer between RAM and the GPU is taken into account, a 5- to 8-fold speedup is attained with the use of mapped memory.
Keywords: CUDA, GPGPU, systems of linear equations, parallel cyclic reduction, PCR, sweep method.
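
The parallel cyclic reduction named in the abstract can be illustrated with a minimal CPU sketch. It is not the authors' GPU implementation: the function names `thomas_solve` and `pcr_solve`, the list-based storage, and the convention that the system is written as a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] are assumptions made for illustration only. The inner loop over i in `pcr_solve` has no dependence between iterations, which is the property that would let each equation be assigned to its own GPU thread; the serial sweep method is included for comparison.

```python
def thomas_solve(a, b, c, d):
    """Serial sweep (Thomas) method for a tridiagonal system.

    a: sub-diagonal (a[0] is not used), b: main diagonal,
    c: super-diagonal (c[-1] is expected to be 0), d: right-hand side.
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    # Forward sweep: eliminate the sub-diagonal one row at a time.
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    # Back substitution.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def pcr_solve(a, b, c, d):
    """Parallel cyclic reduction, emulated serially (illustrative sketch).

    At stride s, equation i couples x[i-s], x[i], x[i+s]. One reduction
    step combines it with equations i-s and i+s so that it couples
    x[i-2s], x[i], x[i+2s] instead. After about log2(n) steps every
    equation contains only x[i].
    """
    n = len(d)
    a, b, c, d = list(a), list(b), list(c), list(d)
    s = 1
    while s < n:
        na, nb, nc, nd = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
        # Each iteration reads only the old arrays, so all i are independent
        # (on a GPU this loop would map to one thread per equation).
        for i in range(n):
            lo, hi = i - s, i + s
            nb[i] = b[i]
            nd[i] = d[i]
            if lo >= 0:
                alpha = -a[i] / b[lo]      # cancels the x[i-s] term
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                gamma = -c[i] / b[hi]      # cancels the x[i+s] term
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        s *= 2
    # All equations are now decoupled: b[i] * x[i] = d[i].
    return [d[i] / b[i] for i in range(n)]
```

Both functions take the same four arrays, so they can be called on the same diagonally dominant system and their results compared. The sketch does no pivoting and assumes no zero appears on the main diagonal during the reduction.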