A Unique Hastening Communication-Bound Sparse Repeatable Solver



This paper presents a systematic procedure for selecting the algorithmic parameter k, which controls the communication-computation tradeoff on hardware accelerators such as FPGAs and GPUs. While k iterations of an iterative solver can be unrolled to reduce communication cost, the extent of the unrolling depends on the underlying architecture, its memory model, and the growth in redundant computation. Trading communication for redundant computation can increase the silicon efficiency of FPGAs and GPUs in accelerating communication-bound sparse iterative solvers. On an NVIDIA C2050 GPU, we demonstrate a speedup over standard iterative solvers for a range of benchmarks and show that this speedup is limited by the growth in redundant computation. In contrast, for FPGAs, we present an architecture-aware algorithm that limits off-chip communication but allows communication between the processing cores. This reduces redundant computation and permits large k, and therefore higher speedups. Our FPGA approach provides a speedup over same-generation GPU devices across a range of benchmarks when k is selected carefully for each architecture. We provide predictive models to understand this tradeoff and show how careful selection of k can yield performance improvements that would otherwise demand a significant increase in memory bandwidth.
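
The tradeoff described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm. It assumes a simple tridiagonal matrix (a three-point stencil), and the names `stencil_step`, `naive_power`, `blocked_power`, and the `block` parameter are hypothetical. Each block fetches its own rows plus k extra "ghost" elements per side in a single load, then applies k matrix-vector products locally. The ghost region is the redundant computation paid to avoid k separate rounds of communication.

```python
def stencil_step(v):
    """One sparse matrix-vector product with tridiag(1, -2, 1), zero boundary."""
    n = len(v)
    return [
        (v[i - 1] if i > 0 else 0.0)
        - 2.0 * v[i]
        + (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


def naive_power(x, k):
    """Apply the matrix k times; every step re-reads the whole vector."""
    for _ in range(k):
        x = stencil_step(x)
    return x


def blocked_power(x, k, block):
    """Unrolled variant (illustrative assumption, not the paper's kernel).

    Each block loads its elements plus up to k ghost elements per side once,
    then performs k local steps. Values near a cut edge become stale by one
    more element per step, so only the interior of the block is kept.
    Returns (result, elements_loaded, element_updates).
    """
    n = len(x)
    out = [0.0] * n
    loads = 0
    updates = 0
    for start in range(0, n, block):
        end = min(start + block, n)
        lo = max(start - k, 0)   # ghost zone on the left
        hi = min(end + k, n)     # ghost zone on the right
        local = x[lo:hi]
        loads += hi - lo         # single fetch per block
        for _ in range(k):
            local = stencil_step(local)
            updates += hi - lo   # includes redundant ghost updates
        out[start:end] = local[start - lo:end - lo]
    return out, loads, updates


if __name__ == "__main__":
    n, k, block = 64, 4, 16
    x = [float(i % 7) for i in range(n)]
    reference = naive_power(x, k)
    result, loads, updates = blocked_power(x, k, block)
    print("results match:", result == reference)
    print("elements loaded: blocked", loads, "vs naive", n * k)
    print("element updates: blocked", updates, "vs naive", n * k)
```

For simplicity the sketch recomputes the whole padded block at every step. Increasing k lowers the number of elements loaded relative to k separate passes but raises the number of redundant updates, which is the growth that limits the useful range of k.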

Copyright (c) 2016 Edupedia Publications Pvt Ltd

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


All published Articles are Open Access at  https://journals.pen2print.org/index.php/ijr/ 
