ANALYSIS OF CUDA EFFICIENCY IN SOLVING LINEAR TRIDIAGONAL SYSTEMS FOR THEORETICAL OPTION PRICING
ITMO University; Post-Graduate Student
Y. A. Shpolyanskiy
ITMO University; Professor
M. . Kosyakov
ITMO University; Department of Computer Technology;
Abstract. The parallel cyclic reduction (PCR) method for solving tridiagonal systems of linear equations is implemented on a GPU. Forming the matrix directly in GPU global memory is shown to be advantageous. This approach gives a more than 20-fold speedup over single-threaded computation. When data transfer between RAM and the GPU is taken into account, a 5- to 8-fold speedup is attained with the use of mapped memory.
Keywords: CUDA, GPGPU, systems of linear equations, parallel cyclic reduction, PCR, sweep method (Thomas algorithm).
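
For orientation, the two methods named in the keywords can be contrasted in a minimal sketch. This is not the CUDA implementation discussed in the paper. It is a sequential Python illustration, and the function names thomas_solve and pcr_solve, as well as the convention a[0] = c[n-1] = 0, are assumptions made here. The system is written as a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i]. In the PCR routine, every pass of the inner loop over i is independent of the others, which is what would allow one GPU thread per equation.

```python
def thomas_solve(a, b, c, d):
    """Sweep method (Thomas algorithm): sequential O(n) reference solver."""
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Backward sweep: substitute from the last unknown upwards.
    x = [0.0] * n
    x[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def pcr_solve(a, b, c, d):
    """Parallel cyclic reduction, emulated sequentially.

    Assumes a[0] == 0 and c[n-1] == 0. After each step, equation i
    couples x[i] only to x[i - 2*stride] and x[i + 2*stride].
    """
    n = len(d)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        na = [0.0] * n
        nb = list(b)
        nc = [0.0] * n
        nd = list(d)
        # Each i reads only the old arrays, so the iterations are independent.
        for i in range(n):
            lo, hi = i - stride, i + stride
            if lo >= 0:
                alpha = -a[i] / b[lo]      # cancels the x[lo] term
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                gamma = -c[i] / b[hi]      # cancels the x[hi] term
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2
    # All off-diagonal couplings are now zero: each equation is b[i]*x[i] = d[i].
    return [d[i] / b[i] for i in range(n)]
```

Both routines are expected to agree to rounding error on a diagonally dominant system. PCR performs O(n log n) arithmetic in about log2(n) steps, whereas the sweep method performs O(n) arithmetic in n strictly sequential steps, which is why only the former maps naturally onto many GPU threads.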