Speeding up decimal multiplication
arXiv - CS - Data Structures and Algorithms. Pub Date: 2020-11-23, DOI: arxiv-2011.11524
Viktor Krapivensky

Decimal multiplication is the task of multiplying two numbers in base $10^N$. Specifically, we focus on the number-theoretic transform (NTT) family of algorithms. Using only portable techniques, we achieve a 3x to 5x speedup over the mpdecimal library. In this paper we describe our implementation and discuss further possible optimizations. We also present a simple cache-efficient algorithm for in-place $2n \times n$ or $n \times 2n$ matrix transposition, the need for which arises in the "six-step algorithm" variation of the matrix Fourier algorithm, and which does not seem to be widely known. Another finding is that the use of two prime moduli instead of three makes sense even considering the worst case of increasing the size of the input, and makes for simpler answer recovery.
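To make the two-modulus idea concrete, here is a minimal Python sketch of NTT-based multiplication in base $10^N$. It is not the paper's implementation. The limb size ($N = 4$), the two primes (998244353 and 469762049, both with primitive root 3), and all function names are assumptions chosen for illustration. Each input is split into base-$10^4$ limbs, the limbs are convolved modulo each prime, and the exact coefficients are recovered with a two-prime Chinese remainder step before carries are propagated.

```python
# Illustrative sketch only: primes, limb size and names are assumptions,
# not taken from the paper.
P1, P2 = 998244353, 469762049      # NTT-friendly primes, primitive root 3
G = 3
BASE = 10**4                       # base 10^N with N = 4
P1_INV_MOD_P2 = pow(P1, -1, P2)    # modular inverse (Python 3.8+)

def ntt(a, p, invert=False):
    """Iterative radix-2 NTT modulo p; len(a) must be a power of two."""
    a = list(a)
    n = len(a)
    j = 0
    for i in range(1, n):          # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(G, (p - 1) // length, p)   # primitive length-th root
        if invert:
            w_len = pow(w_len, p - 2, p)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u, v = a[k], a[k + half] * w % p
                a[k] = (u + v) % p
                a[k + half] = (u - v) % p
                w = w * w_len % p
        length <<= 1
    if invert:
        n_inv = pow(n, p - 2, p)
        a = [x * n_inv % p for x in a]
    return a

def convolve_mod(x, y, p, n):
    """Cyclic convolution of x and y modulo p, zero-padded to length n."""
    fx = ntt(x + [0] * (n - len(x)), p)
    fy = ntt(y + [0] * (n - len(y)), p)
    return ntt([s * t % p for s, t in zip(fx, fy)], p, invert=True)

def crt2(r1, r2):
    """Recover the unique value in [0, P1*P2) from its two residues."""
    return r1 + P1 * ((r2 - r1) * P1_INV_MOD_P2 % P2)

def to_limbs(value):
    """Split a non-negative int into little-endian base-BASE limbs."""
    limbs = []
    while value:
        value, r = divmod(value, BASE)
        limbs.append(r)
    return limbs or [0]

def multiply(a, b):
    """Multiply two non-negative ints via two-prime NTT convolution."""
    x, y = to_limbs(a), to_limbs(b)
    n = 1
    while n < len(x) + len(y):     # room for the full product
        n <<= 1
    c1 = convolve_mod(x, y, P1, n)
    c2 = convolve_mod(x, y, P2, n)
    result, carry = [], 0
    for r1, r2 in zip(c1, c2):     # exact coefficient, then carry
        carry, limb = divmod(crt2(r1, r2) + carry, BASE)
        result.append(limb)
    return sum(limb * BASE**i for i, limb in enumerate(result))
```

With these assumed parameters every convolution coefficient is at most $n \cdot (10^4 - 1)^2$, which stays below $P_1 P_2 \approx 4.7 \times 10^{17}$ for any practical $n$, so two residues are enough to recover it exactly. A call such as `multiply(12345678901234567890, 98765432109876543210)` should agree with Python's built-in integer product.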

Updated: 2020-11-25
