GPU Tensor Cores for Fast Arithmetic Reductions | IEEE Journals & Magazine | IEEE Xplore