• arXiv.cs.MS Pub Date : 2019-09-30
Peter Bastian; Markus Blatt; Andreas Dedner; Nils-Arne Dreier; Christian Engwer; René Fritze; Carsten Gräser; Christoph Grüninger; Dominic Kempf; Robert Klöfkorn; Mario Ohlberger; Oliver Sander

This paper presents the basic concepts and the module structure of the Distributed and Unified Numerics Environment and reflects on recent developments and general changes that happened since the release of the first Dune version in 2007 and the main papers describing that state [1, 2]. This discussion is accompanied by a description of various advanced features, such as coupling of domains and cut

Updated: 2020-04-08
• arXiv.cs.MS Pub Date : 2020-04-03
Jonas Klappert; Sven Yannick Klein; Fabian Lange

We present the main improvements and new features in version 2.0 of the open-source C++ library FireFly for the interpolation of rational functions. This includes algorithmic improvements, e.g. a hybrid algorithm for dense and sparse rational functions and an algorithm to identify and remove univariate factors. The new version is applied to a Feynman-integral reduction

Updated: 2020-04-06
• arXiv.cs.MS Pub Date : 2020-04-02
Anastasia A. Funkner; Aleksey N. Yakovlev; Sergey V. Kovalchuk

The paper proposes an approach for surrogate-assisted tuning of knowledge discovery algorithms. The approach is based on the prediction of both the quality and performance of the target algorithm. The prediction is further used as objectives for the optimization and tuning of the algorithm. The approach is investigated using the clinical pathways (CP) discovery problem resolved using the evolutionary-based

Updated: 2020-04-03
• arXiv.cs.MS Pub Date : 2020-03-28
Jean-Matthieu Gallard; Leonhard Rannabauer; Anne Reinarz; Michael Bader

We present a sequence of optimizations to the performance-critical compute kernels of the high-order discontinuous Galerkin solver of the hyperbolic PDE engine ExaHyPE -- successively tackling bottlenecks due to SIMD operations, cache hierarchies and restrictions in the software design. Starting from a generic scalar implementation of the numerical scheme, our first optimized variant applies state-of-the-art

Updated: 2020-03-31
• arXiv.cs.MS Pub Date : 2020-03-28
Stephan Hageboeck; Lorenzo Moneta

RooFit and RooStats, the toolkits for statistical modelling in ROOT, are used in most searches and measurements at the Large Hadron Collider. The data to be collected in Run 3 will enable measurements with higher precision and models with larger complexity, but also require faster data processing. In this work, first results on modernising RooFit's collections, restructuring data flow and vectorising

Updated: 2020-03-31
• arXiv.cs.MS Pub Date : 2020-03-28
Stephan Hageboeck

RooFit and RooStats, the toolkits for statistical modelling in ROOT, are used in most searches and measurements at the Large Hadron Collider as well as at $B$ factories. Larger datasets to be collected in e.g. the LHC's Run 3 will enable measurements with higher precision, but will require faster data processing to keep fitting times stable. In this work, a simplification of RooFit's interfaces and

Updated: 2020-03-31
• arXiv.cs.MS Pub Date : 2019-11-15
Jean-Matthieu Gallard; Lukas Krenz; Leonhard Rannabauer; Anne Reinarz; Michael Bader

The development of a high performance PDE solver requires the combined expertise of interdisciplinary teams with respect to application domain, numerical scheme and low-level optimization. In this paper, we present how the ExaHyPE engine facilitates the collaboration of such teams by isolating three roles: application, algorithms, and optimization expert. We thus support team members in letting them

Updated: 2020-03-31
• arXiv.cs.MS Pub Date : 2020-03-23
Nir Goren; Dan Halperin; Sivan Toledo

We show how to efficiently solve a clustering problem that arises in a method to evaluate functions of matrices. The problem requires finding the connected components of a graph whose vertices are eigenvalues of a real or complex matrix and whose edges are pairs of eigenvalues that are at most $\delta$ away from each other. Davies and Higham proposed solving this problem by enumerating the edges of the

Updated: 2020-03-28
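
To make the clustering problem in the entry above (Goren, Halperin, Toledo) concrete, here is a minimal sketch of the naive edge-enumeration baseline: every pair of eigenvalues is tested against the threshold and merged with a union-find structure. The function name `cluster_eigenvalues` and the sample data are hypothetical, and this quadratic-time scan is not the efficient method the paper proposes.

```python
from itertools import combinations


def cluster_eigenvalues(eigs, delta):
    """Return the connected components of the graph whose vertices are the
    eigenvalues and whose edges join pairs at distance <= delta.

    Naive baseline: all O(n^2) pairs are enumerated (an assumption made for
    illustration, not the paper's algorithm)."""
    parent = list(range(len(eigs)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(eigs)), 2):
        if abs(eigs[i] - eigs[j]) <= delta:  # abs() also handles complex values
            parent[find(i)] = find(j)

    clusters = {}
    for i, value in enumerate(eigs):
        clusters.setdefault(find(i), []).append(value)
    # Dict insertion order puts clusters in order of their first member.
    return list(clusters.values())


if __name__ == "__main__":
    sample = [0.0, 0.05, 0.1, 1.0, 1.0 + 0.05j, 3.0]
    print(cluster_eigenvalues(sample, delta=0.06))
```

Note that 0.0 and 0.1 end up in the same cluster even though they are farther apart than the threshold, because they are linked through 0.05; this chaining is what distinguishes connected components from a simple distance test.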
• arXiv.cs.MS Pub Date : 2020-03-26
Georg Grasegger; Jan Legerský

In this paper we present the SageMath package FlexRiLoG (short for flexible and rigid labelings of graphs). Based on recent results the software generates motions of graphs using special edge colorings. The package computes and illustrates the colorings and the motions. We present the structure and usage of the package.

Updated: 2020-03-28
• arXiv.cs.MS Pub Date : 2019-12-03
Michael Riesch; Tien Dat Nguyen; Christian Jirauschek

Science depends heavily on reliable and easy-to-use software packages, such as mathematical libraries or data analysis tools. Developing such packages requires a lot of effort, which is too often avoided due to the lack of funding or recognition. In order to reduce the efforts required to create sustainable software packages, we present a project skeleton that ensures the best software engineering

Updated: 2020-03-28
• arXiv.cs.MS Pub Date : 2020-03-22
Leonid B. Sokolinsky; Irina M. Sokolinskaya

In this paper, a scalable iterative projection-type algorithm for solving non-stationary systems of linear inequalities is considered. A non-stationary system is understood as a large-scale system of inequalities in which coefficients and constant terms can change during the calculation process. The proposed parallel algorithm uses the concept of pseudo-projection which generalizes the notion of orthogonal

Updated: 2020-03-24
• arXiv.cs.MS Pub Date : 2020-03-17
Francesco Rizzi; Patrick J. Blonigan; Kevin T. Carlberg

This work introduces Pressio, an open-source project aimed at enabling leading-edge projection-based reduced order models (ROMs) for large-scale nonlinear dynamical systems in science and engineering. Pressio provides model-reduction methods that can reduce both the number of spatial and temporal degrees of freedom for any dynamical system expressible as a system of parameterized ordinary differential

Updated: 2020-03-18
• arXiv.cs.MS Pub Date : 2019-07-18
Lukas Einkemmer

In this paper, our goal is to efficiently solve the Vlasov equation on GPUs. A semi-Lagrangian discontinuous Galerkin scheme is used for the discretization. Such kinetic computations are extremely expensive due to the high-dimensional phase space. The SLDG code, which is publicly available under the MIT license, abstracts the number of dimensions and uses a shared codebase for both GPU and CPU based

Updated: 2020-03-18
• arXiv.cs.MS Pub Date : 2020-03-13
Fredrik Johansson (LFANT)

We present the Mathematical Functions Grimoire (FunGrim), a website and database of formulas and theorems for special functions. We also discuss the symbolic computation library used as the backend and main development tool for FunGrim, and the Grim formula language used in these projects to represent mathematical content semantically.

Updated: 2020-03-16
• arXiv.cs.MS Pub Date : 2020-03-13
Hessa Al-Thani; Jon Lee

We present an open-source R package (MESgenCov v 0.1.0) for temporally fitting multivariate precipitation chemistry data and extracting a covariance matrix for use in the MESP (maximum-entropy sampling problem). We provide multiple functionalities for modeling and model assessment. The package is tightly coupled with NADP/NTN (National Atmospheric Deposition Program / National Trends Network) data

Updated: 2020-03-16
• arXiv.cs.MS Pub Date : 2020-03-11
Pratik Nayak; Terry Cojean; Hartwig Anzt

With the commencement of the exascale computing era, we realize that the majority of the leadership supercomputers are heterogeneous and massively parallel even on a single node with multiple co-processors such as GPUs and multiple cores on each node. For example, ORNL's Summit accumulates six NVIDIA Tesla V100s and 42 core IBM Power9s on each node. Synchronizing across all these compute resources in

Updated: 2020-03-12
• arXiv.cs.MS Pub Date : 2020-03-09
Divyam Aggarwal; Dhish Kumar Saxena; Thomas Bäck; Michael Emmerich

Airline scheduling poses some of the most challenging problems in the entire Operations Research (OR) domain. In that, crew scheduling (CS) constitutes one of the most important and challenging planning activities. Notably, the crew operating cost is the second-largest component of an airline's total operating cost (after the fuel cost). Hence, its optimization promises enormous benefits, and even

Updated: 2020-03-10
• arXiv.cs.MS Pub Date : 2020-03-09
Ryan R. Curtin; Marcus Edel; Rahul Ganesh Prabhu; Suryoday Basak; Zhihao Lou; Conrad Sanderson

This report provides an introduction to the ensmallen numerical optimization library, as well as a deep dive into the technical details of how it works. The library provides a fast and flexible C++ framework for mathematical optimization of arbitrary user-supplied functions. A large set of pre-built optimizers is provided, including many variants of Stochastic Gradient Descent and Quasi-Newton optimizers

Updated: 2020-03-10
• arXiv.cs.MS Pub Date : 2020-03-06
Corey Schimpf; Brian Castellani

COMPLEX-IT is a case-based, mixed-methods platform for social inquiry into complex data/systems, designed to increase non-expert access to the tools of computational social science (i.e., cluster analysis, artificial intelligence, data visualization, data forecasting, and scenario simulation). In particular, COMPLEX-IT aids social inquiry through a heavy emphasis on learning about the complex data/system

Updated: 2020-03-09
• arXiv.cs.MS Pub Date : 2019-07-03
Sambit Panda; Satish Palaniappan; Junhao Xiong; Eric W. Bridgeford; Ronak Mehta; Cencheng Shen; Joshua T. Vogelstein

We introduce hyppo, a unified library for performing multivariate hypothesis testing, including independence, two-sample, and k-sample testing. While many multivariate independence tests have R packages available, the interfaces are inconsistent and most are not available in Python. hyppo includes many state of the art multivariate testing procedures. The package is easy-to-use and is flexible enough

Updated: 2020-03-09
• arXiv.cs.MS Pub Date : 2020-03-04
Peter Benner; Martin Köhler; Jens Saak

Matrix equations are omnipresent in (numerical) linear algebra and systems theory. Especially in model order reduction (MOR) they play a key role in many balancing based reduction methods for linear dynamical systems. When these systems arise from spatial discretizations of evolutionary partial differential equations, their coefficient matrices typically become large and sparse. Moreover, the

Updated: 2020-03-05
• arXiv.cs.MS Pub Date : 2019-05-20
Anne Reinarz; Dominic E. Charrier; Michael Bader; Luke Bovard; Michael Dumbser; Kenneth Duru; Francesco Fambri; Alice-Agnes Gabriel; Jean-Matthieu Gallard; Sven Köppel; Lukas Krenz; Leonhard Rannabauer; Luciano Rezzolla; Philipp Samfass; Maurizio Tavelli; Tobias Weinzierl

ExaHyPE ("An Exascale Hyperbolic PDE Engine") is a software engine for solving systems of first-order hyperbolic partial differential equations (PDEs). Hyperbolic PDEs are typically derived from the conservation laws of physics and are useful in a wide range of application areas. Applications powered by ExaHyPE can be run on a student's laptop, but are also able to exploit thousands of processor cores

Updated: 2020-03-05
• arXiv.cs.MS Pub Date : 2020-02-29
Huckleberry Febbo; Paramsothy Jayakumar; Jeffrey L. Stein; Tulga Ersal

Current direct-collocation-based optimal control software is either easy to use or fast, but not both. This is a major limitation for users that are trying to formulate complex optimal control problems (OCPs) for use in on-line applications. This paper introduces NLOptControl, an open-source modeling language that allows users to both easily formulate and quickly solve nonlinear OCPs using direct-collocation

Updated: 2020-03-03
• arXiv.cs.MS Pub Date : 2020-02-28
Peter Benner; Steffen W. R. Werner

For an easy use of model order reduction techniques in applications, software solutions are needed. In this paper, we describe the MORLAB, Model Order Reduction LABoratory, toolbox as an efficient implementation of model reduction techniques for dense, medium-scale linear time-invariant systems. Giving an introduction to the underlying programming principles of the toolbox, we show the basic idea of

Updated: 2020-03-02
• arXiv.cs.MS Pub Date : 2019-05-02
Akshay Agrawal; Stephen Boyd

We present a composition rule involving quasiconvex functions that generalizes the classical composition rule for convex functions. This rule complements well-known rules for the curvature of quasiconvex functions under increasing functions and pointwise maximums. We refer to the class of optimization problems generated by these rules, along with a base set of quasiconvex and quasiconcave functions

Updated: 2020-03-02
• arXiv.cs.MS Pub Date : 2020-02-27
Markus Frings; Norbert Hosters; Corinna Müller; Max Spahn; Christoph Susen (Chair for Computational Analysis of Technical Systems, Aachen, Germany)

This paper provides the description of a novel, multi-purpose spline library. In accordance with the increasingly diverse modes of usage of splines, it is multi-purpose in the sense that it supports geometry representation, finite element analysis, and optimization. The library features reading and writing for various file formats and a wide range of spline manipulation algorithms. Further, a new efficient

Updated: 2020-02-28
• arXiv.cs.MS Pub Date : 2020-02-26
Stephane Breuils; Vincent Nozick; Akihiro Sugimoto

Studies on time and memory costs of products in geometric algebra have been limited to cases where multivectors with multiple grades have only non-zero elements. This allows the design of efficient algorithms for a generic purpose; however, it does not reflect the practical usage of geometric algebra. Indeed, in applications related to geometry, multivectors are likely to be full homogeneous, having their

Updated: 2020-02-27
• arXiv.cs.MS Pub Date : 2020-02-26
J. Pizarroso; J. Portela; A. Muñoz

Neural networks are important tools for data-intensive analysis and are commonly applied to model non-linear relationships between dependent and independent variables. However, neural networks are usually seen as "black boxes" that offer minimal information about how the input variables are used to predict the response in a fitted model. This article describes the NeuralSens package that can

Updated: 2020-02-27
• arXiv.cs.MS Pub Date : 2018-06-19
Dominic E. Charrier; Benjamin Hazelwood; Tobias Weinzierl

High-order Discontinuous Galerkin (DG) methods promise to be an excellent discretisation paradigm for partial differential equation solvers by combining high arithmetic intensity with localised data access. They also facilitate dynamic adaptivity without the need for conformal meshes. A parallel evaluation of DG's weak formulation within a mesh traversal is non-trivial, as dependency graphs over dynamically

Updated: 2020-02-25
• arXiv.cs.MS Pub Date : 2020-02-19
Peter Munch; Katharina Kormann; Martin Kronbichler

This work presents the efficient, matrix-free finite-element library hyper.deal for solving partial differential equations in two to six dimensions with high-order discontinuous Galerkin methods. It builds upon the low-dimensional finite-element library deal.II to create complex low-dimensional meshes and to operate on them individually. These meshes are combined via a tensor product on the fly and

Updated: 2020-02-20
• arXiv.cs.MS Pub Date : 2019-05-08
Dominik Ernst; Georg Hager; Jonas Thies; Gerhard Wellein

General matrix-matrix multiplications with double-precision real and complex entries (DGEMM and ZGEMM) in vendor-supplied BLAS libraries are best optimized for square matrices but often show bad performance for tall & skinny matrices, which are much taller than wide. NVIDIA's current CUBLAS implementation delivers only a fraction of the potential performance as indicated by the roofline model in this

Updated: 2020-02-19
• arXiv.cs.MS Pub Date : 2019-08-11
Eleftherios Avramidis; Marta Lalik; Ozgur E. Akman

Stochastic differential equations (SDEs) are widely used to model systems affected by random processes. In general, the analysis of an SDE model requires numerical solutions to be generated many times over multiple parameter combinations. However, this process often requires considerable computational resources to be practicable. Due to the embarrassingly parallel nature of the task, devices such as

Updated: 2020-02-19
• arXiv.cs.MS Pub Date : 2020-02-17
Nathan Heavner; Per-Gunnar Martinsson; Gregorio Quintana-Ortí

This paper describes efficient algorithms for computing rank-revealing factorizations of matrices that are too large to fit in RAM, and must instead be stored on slow external memory devices such as solid-state or spinning disk hard drives (out-of-core or out-of-memory). Traditional algorithms for computing rank revealing factorizations, such as the column pivoted QR factorization, or techniques for

Updated: 2020-02-18
• arXiv.cs.MS Pub Date : 2019-10-24
Daniel Arndt; Wolfgang Bangerth; Denis Davydov; Timo Heister; Luca Heltai; Martin Kronbichler; Matthias Maier; Jean-Paul Pelteret; Bruno Turcksin; David Wells

deal.II is a state-of-the-art finite element library focused on generality, dimension-independent programming, parallelism, and extensibility. Herein, we outline its primary design considerations and its sophisticated features such as distributed meshes, $hp$-adaptivity, support for complex geometries, and matrix-free algorithms. But deal.II is more than just a software library: It is also a diverse

Updated: 2020-02-18
• arXiv.cs.MS Pub Date : 2019-12-18
Patrick E. Farrell; Matthew G. Knepley; Lawrence Mitchell; Florian Wechsung

Effective relaxation methods are necessary for good multigrid convergence. For many equations, standard Jacobi and Gauss-Seidel are inadequate, and more sophisticated space decompositions are required; examples include problems with semidefinite terms or saddle point structure. In this paper we present a unifying software abstraction, PCPATCH, for the topological construction of space decompositions

Updated: 2020-02-18
• arXiv.cs.MS Pub Date : 2020-02-12
Eric Polizzi

The FEAST library package represents a unified framework for solving various families of eigenvalue problems and achieving accuracy, robustness, high performance and scalability on parallel architectures. Its originality lies with a new transformative numerical approach to the traditional eigenvalue algorithm design - the FEAST algorithm. The algorithm gathers key elements from complex analysis, numerical

Updated: 2020-02-13
• arXiv.cs.MS Pub Date : 2020-02-12
Katja Bercic; Jacques Carette; William M. Farmer; Michael Kohlhase; Dennis Müller; Florian Rabe; Yasmine Sharoda

Mathematical software systems are becoming more and more important in pure and applied mathematics in order to deal with the complexity and scalability issues inherent in mathematics. In the last decades we have seen a Cambrian explosion of increasingly powerful but also diverging systems. To give researchers a guide to this space of systems, we devise a novel conceptualization of mathematical software

Updated: 2020-02-13
• arXiv.cs.MS Pub Date : 2020-02-12
Mirko Myllykoski; Carl Christian Kjelgaard Mikkelsen

In this paper, we present the StarNEig library for solving dense nonsymmetric standard and generalized eigenvalue problems. The library is built on top of the StarPU runtime system and targets both shared and distributed memory machines. Some components of the library have support for GPU acceleration. The library is currently in an early beta state and supports only real matrices. Support for complex

Updated: 2020-02-13
• arXiv.cs.MS Pub Date : 2020-01-22
Julian Blank; Kalyanmoy Deb

Python has become the programming language of choice for research and industry projects related to data science, machine learning, and deep learning. Since optimization is an inherent part of these research fields, more optimization related frameworks have arisen in the past few years. Only a few of them support optimization of multiple conflicting objectives at a time, but do not provide comprehensive

Updated: 2020-02-12
• arXiv.cs.MS Pub Date : 2020-02-09
Tianjian Lu; Yi-Fan Chen; Blake Hechtman; Tao Wang; John Anderson

In this work, we present a parallel algorithm for large-scale discrete Fourier transform (DFT) on Tensor Processing Unit (TPU) clusters. The algorithm is implemented in TensorFlow because of its rich set of functionalities for scientific computing and simplicity in realizing parallel computing algorithms. The DFT formulation is based on matrix multiplications between the input data and the Vandermonde

Updated: 2020-02-11
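
The entry above (Lu et al.) describes a DFT formulated as a matrix multiplication with a Vandermonde matrix. The sketch below shows that formulation in plain Python on a single machine; it is not the TensorFlow/TPU implementation from the paper. The function names `dft_matrix` and `dft`, and the forward-transform sign convention $w = e^{-2\pi i/n}$, are assumptions made for illustration.

```python
import cmath


def dft_matrix(n):
    """Build the n-by-n Vandermonde matrix with entries w**(j*k),
    where w = exp(-2*pi*i/n) (assumed forward-DFT sign convention)."""
    w = cmath.exp(-2j * cmath.pi / n)
    return [[w ** (j * k) for k in range(n)] for j in range(n)]


def dft(x):
    """Compute the DFT of the sequence x as a dense matrix-vector product.

    This costs O(n^2) operations, unlike an FFT, but it maps directly onto
    matrix-multiplication hardware."""
    n = len(x)
    vandermonde = dft_matrix(n)
    return [sum(vandermonde[j][k] * x[k] for k in range(n)) for j in range(n)]


if __name__ == "__main__":
    # A unit impulse transforms to a constant spectrum,
    # and a constant signal transforms to a single nonzero bin.
    print(dft([1, 0, 0, 0]))
    print(dft([1, 1, 1, 1]))
```

The dense product trades the lower operation count of a fast Fourier transform for a regular computation pattern, which is the property that makes a matrix-multiplication formulation attractive on accelerator hardware.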
• arXiv.cs.MS Pub Date : 2020-02-09
Yang Liu; Xin Xing; Han Guo; Eric Michielssen; Pieter Ghysels; Xiaoye Sherry Li

This paper presents an adaptive randomized algorithm for computing the butterfly factorization of a $m\times n$ matrix with $m\approx n$ provided that both the matrix and its transpose can be rapidly applied to arbitrary vectors. The resulting factorization is composed of $O(\log n)$ sparse factors, each containing $O(n)$ nonzero entries. The factorization can be attained using $O(n^{3/2}\log n)$ computation

Updated: 2020-02-11
• arXiv.cs.MS Pub Date : 2018-07-09
Fabio Luporini; Michael Lange; Mathias Louboutin; Navjot Kukreja; Jan Hückelheim; Charles Yount; Philipp Witte; Paul H. J. Kelly; Felix J. Herrmann; Gerard J. Gorman

Stencil computations are a key part of many high-performance computing applications, such as image processing, convolutional neural networks, and finite-difference solvers for partial differential equations. Devito is a framework capable of generating highly-optimized code given symbolic equations expressed in Python, specialized in, but not limited to, affine (stencil) codes. The lowering process---from

Updated: 2020-02-10
• arXiv.cs.MS Pub Date : 2020-01-31
John Maclean; J. E. Bunder; A. J. Roberts

The 'equation-free toolbox' empowers the computer-assisted analysis of complex, multiscale systems. Its aim is to enable you to immediately use microscopic simulators to perform macro-scale system level tasks and analysis, because micro-scale simulations are often the best available description of a system. The methodology bypasses the derivation of macroscopic evolution equations by computing the

Updated: 2020-02-06
• arXiv.cs.MS Pub Date : 2020-01-31
Martin Bauer; Harald Köstler; Ulrich Rüde

Lattice Boltzmann methods are a popular mesoscopic alternative to macroscopic computational fluid dynamics solvers. Many variants have been developed that vary in complexity, accuracy, and computational cost. Extensions are available to simulate multi-phase, multi-component, turbulent, or non-Newtonian flows. In this work we present lbmpy, a code generation package that supports a wide variety of different

Updated: 2020-02-03
• arXiv.cs.MS Pub Date : 2020-01-23
Zenan Huo; Gang Mei; Nengxiong Xu

The Smoothed Finite Element Method (S-FEM) proposed by Liu G.R. can achieve more accurate results than the conventional FEM. Currently, much commercial software and many open-source packages have been developed to analyze various science and engineering problems using the FEM. However, there is little work focusing on designing and developing software or packages for the S-FEM. In this paper, we design

Updated: 2020-01-27
• arXiv.cs.MS Pub Date : 2018-09-16
Egon Balas; Thiago Serra

In this paper, we present a method to determine if a lift-and-project cut for a mixed-integer linear program is irregular, in which case the cut is not equivalent to any intersection cut from the bases of the linear relaxation. This is an important question due to the intense research activity for the past decade on cuts from multiple rows of simplex tableau as well as on lift-and-project cuts from

Updated: 2020-01-27
• arXiv.cs.MS Pub Date : 2019-04-25
Michael Hopkins; Mantas Mikaitis; Dave R. Lester; Steve Furber

Although double-precision floating-point arithmetic currently dominates high-performance computing, there is increasing interest in smaller and simpler arithmetic types. The main reasons are potential improvements in energy efficiency and memory footprint and bandwidth. However, simply switching to lower-precision types typically results in increased numerical errors. We investigate approaches to improving

Updated: 2020-01-23
• arXiv.cs.MS Pub Date : 2019-06-25
Wayne B. Mitchell; Robert Strzodka; Robert D. Falgout

Algebraic multigrid (AMG) is a widely used scalable solver and preconditioner for large-scale linear systems resulting from the discretization of a wide class of elliptic PDEs. While AMG has optimal computational complexity, the cost of communication has become a significant bottleneck that limits its scalability as processor counts continue to grow on modern machines. This paper examines the design

Updated: 2020-01-22
• arXiv.cs.MS Pub Date : 2019-04-23
Francesco Torsello

We present bimEX, a Mathematica package for exact computations in 3+1 bimetric relativity. It is based on the xAct bundle, which can handle computations involving both abstract tensors and their components. In this communication, we refer to the latter case as concrete computations. The package consists of two main parts. The first part involves the abstract tensors, and focuses

Updated: 2020-01-16
• arXiv.cs.MS Pub Date : 2020-01-08
Pascal Fua; Krzysztof Lis

Python currently is the dominant language in the field of Machine Learning but is often criticized for being slow to perform certain tasks. In this report, we use the well-known $N$-queens puzzle as a benchmark to show that once compiled using the Numba compiler it becomes competitive with C++ and Go in terms of execution speed while still allowing for very fast prototyping. This is true of both sequential

Updated: 2020-01-09
• arXiv.cs.MS Pub Date : 2020-01-08
Stephen Chou; Fredrik Kjolstad; Saman Amarasinghe

This paper shows how to generate code that efficiently converts sparse tensors between disparate storage formats (data layouts) like CSR, DIA, ELL, and many others. We decompose sparse tensor conversion into three logical phases: coordinate remapping, analysis, and assembly. We then develop a language that precisely describes how different formats group together and order a tensor's nonzeros in memory

Updated: 2020-01-09
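
As a concrete instance of the format conversions discussed in the entry above (Chou, Kjolstad, Amarasinghe), here is a hand-written sketch of one conversion, from coordinate (COO) triples to CSR. It borrows the phase names "analysis" and "assembly" from the abstract; for this particular pair of formats the coordinate remapping is assumed to be the identity. The function name `coo_to_csr` is hypothetical, and the code is not output of the paper's code generator.

```python
def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triples (rows[i], cols[i], vals[i]) to CSR arrays.

    Returns (indptr, indices, data), where row r occupies the slice
    indptr[r]:indptr[r + 1] of indices and data."""
    # Analysis: count the nonzeros in each row.
    counts = [0] * n_rows
    for r in rows:
        counts[r] += 1

    # A prefix sum over the counts gives the start offset of each row.
    indptr = [0] * (n_rows + 1)
    for r in range(n_rows):
        indptr[r + 1] = indptr[r] + counts[r]

    # Assembly: scatter each nonzero into the next free slot of its row.
    next_slot = indptr[:-1].copy()
    indices = [0] * len(vals)
    data = [0] * len(vals)
    for r, c, v in zip(rows, cols, vals):
        slot = next_slot[r]
        indices[slot] = c
        data[slot] = v
        next_slot[r] += 1
    return indptr, indices, data


if __name__ == "__main__":
    # 3x3 matrix with nonzeros at (2,0), (0,1), (0,2), (1,1), given unsorted.
    print(coo_to_csr(3, [2, 0, 0, 1], [0, 1, 2, 1], [5.0, 1.0, 2.0, 3.0]))
```

Within a row, entries keep the order in which they appear in the input, so column indices are sorted only if the COO input was already sorted by column within each row.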
• arXiv.cs.MS Pub Date : 2020-01-06
Mantas Mikaitis

We describe various issues caused by the lack of rounding in the gcc compiler implementation of the fixed-point arithmetic data types and operations. We demonstrate that there is no rounding in the conversion of constants, conversion from one numerical type to a less precise type and results of multiplications. Furthermore, we show that mixed-precision operations of fixed-point arithmetic lose precision

Updated: 2020-01-07
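
The entry above (Mikaitis) concerns missing rounding in fixed-point conversions and multiplications. The sketch below only illustrates, in a generic way, what the difference between truncation and round-to-nearest looks like for a fixed-point format; it does not reproduce gcc's behavior. The 15 fractional bits, the floor-style truncation, and all function names are assumptions chosen for the example.

```python
import math

FRAC_BITS = 15          # assumed number of fractional bits
SCALE = 1 << FRAC_BITS  # value of 1.0 in the fixed-point encoding


def to_fixed_floor(x):
    """Convert a real constant to fixed point by discarding the excess bits."""
    return math.floor(x * SCALE)


def to_fixed_nearest(x):
    """Convert a real constant to fixed point with round-to-nearest."""
    return math.floor(x * SCALE + 0.5)


def mul_floor(a, b):
    """Multiply two fixed-point values and truncate the double-width product."""
    return (a * b) >> FRAC_BITS


def mul_nearest(a, b):
    """Multiply two fixed-point values and round the product to nearest."""
    return (a * b + (1 << (FRAC_BITS - 1))) >> FRAC_BITS


if __name__ == "__main__":
    for convert in (to_fixed_floor, to_fixed_nearest):
        q = convert(0.1)
        print(convert.__name__, q, abs(q / SCALE - 0.1))
    half = SCALE // 2  # encodes 0.5
    print(mul_floor(3, half), mul_nearest(3, half))
```

With truncation the error of a single operation can approach one unit in the last place, while round-to-nearest bounds it by half a unit, which is why the absence of rounding matters when many operations are chained.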
• arXiv.cs.MS Pub Date : 2020-01-01
Sheng-Chun Yang; Yong-Lei Wang

Nonequispaced discrete Fourier transformation (NDFT) is widely applied in all aspects of computational science and engineering. The computational efficiency and accuracy of NDFT has always been a critical issue in hindering its comprehensive applications both in intensive and in extensive aspects of scientific computing. In our previous work (2018, S.-C. Yang et al., Appl. Comput. Harmon. Anal. 44

Updated: 2020-01-07
• arXiv.cs.MS Pub Date : 2019-12-28
Ryan Senanayake; Fredrik Kjolstad; Changwan Hong; Shoaib Kamil; Saman Amarasinghe

We address the problem of optimizing mixed sparse and dense tensor algebra in a compiler. We show that standard loop transformations, such as strip-mining, tiling, collapsing, parallelization and vectorization, can be applied to irregular loops over sparse iteration spaces. We also show how these transformations can be applied to the contiguous value arrays of sparse tensor data structures, which we

Updated: 2020-01-04
Contents have been reproduced by permission of the publishers.