Current journal: arXiv - CS - Distributed, Parallel, and Cluster Computing
  • Characterization and Derivation of Heard-Of Predicates for Asynchronous Message-Passing Models
    arXiv.cs.DC Pub Date : 2020-11-25
    Adam Shimi; Aurélie Hurault; Philippe Queinnec

    Message-passing models of distributed computing vary along numerous dimensions: degree of synchrony, kind of faults, number of faults... One way to deal with this variety is by restricting communication to rounds. This is the setting of the Heard-Of model, which captures many models through predicates on the messages sent in a round and received on time (at this round or earlier) by the receivers. Yet

    Updated: 2020-11-27
  • Rapid Exploration of Optimization Strategies on Advanced Architectures using TestSNAP and LAMMPS
    arXiv.cs.DC Pub Date : 2020-11-25
    Rahulkumar Gayatri; Stan Moore; Evan Weinberg; Nicholas Lubbers; Sarah Anderson; Jack Deslippe; Danny Perez; Aidan P. Thompson

    The exascale race is at an end with the announcement of the Aurora and Frontier machines. This next generation of supercomputers utilizes diverse hardware architectures to achieve their compute performance, placing an added onus on the performance portability of applications. An expanding fragmentation of programming models would provide a compounding optimization challenge were it not for the evolution

    Updated: 2020-11-27
  • Optimizing Resource-Efficiency for Federated Edge Intelligence in IoT Networks
    arXiv.cs.DC Pub Date : 2020-11-25
    Yong Xiao; Yingyu Li; Guangming Shi; H. Vincent Poor

    This paper studies an edge intelligence-based IoT network in which a set of edge servers learn a shared model using federated learning (FL) based on the datasets uploaded from a multi-technology-supported IoT network. The data uploading performance of the IoT network and the computational capacity of the edge servers are entangled with each other in influencing the FL model training process. We propose a novel

    Updated: 2020-11-27
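The FL training process the entry above optimizes is built on a simple aggregation step: the server averages client models, weighted by local dataset size. A minimal sketch of that generic FedAvg-style step (the function name and toy sizes are illustrative, not the paper's scheme):

```python
# Weighted model averaging in the style of FedAvg, the generic aggregation
# step that federated learning builds on. This is an illustrative sketch,
# not the resource-optimization method of the paper above.
def fedavg(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    ]

# Two edge servers holding 100- and 300-sample datasets
global_model = fedavg([[1.0, 2.0], [3.0, 4.0]], [100, 300])
```

The larger client dominates the average, which is exactly why data-uploading performance and server capacity become entangled in training quality.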
  • An $O(\log^{3/2}n)$ Parallel Time Population Protocol for Majority with $O(\log n)$ States
    arXiv.cs.DC Pub Date : 2020-11-25
    Stav Ben-Nun; Tsvi Kopelowitz; Matan Kraus; Ely Porat

    In population protocols, the underlying distributed network consists of $n$ nodes (or agents), denoted by $V$, and a scheduler that continuously selects uniformly random pairs of nodes to interact. When two nodes interact, their states are updated by applying a state transition function that depends only on the states of the two nodes prior to the interaction. The efficiency of a population protocol

    Updated: 2020-11-27
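To make the interaction model above concrete, here is a simulation of the classic three-state approximate-majority population protocol (a standard textbook protocol, not the O(log^(3/2) n)-time protocol of the paper; the rules and population size are illustrative):

```python
import random

# Three-state approximate-majority protocol: opinions "A" and "B",
# undecided state "U". A scheduler repeatedly picks a uniformly random
# ordered pair (initiator, responder) and applies a transition rule.
RULES = {
    ("A", "B"): ("A", "U"),   # opposing opinions: the responder blanks out
    ("B", "A"): ("B", "U"),
    ("A", "U"): ("A", "A"),   # an undecided responder adopts the opinion
    ("B", "U"): ("B", "B"),
}

def run_protocol(states, interactions, rng=random):
    """Apply `interactions` random pairwise interactions in place."""
    n = len(states)
    for _ in range(interactions):
        i, j = rng.sample(range(n), 2)
        update = RULES.get((states[i], states[j]))
        if update is not None:
            states[i], states[j] = update
    return states

random.seed(1)
final = run_protocol(["A"] * 70 + ["B"] * 30, 200_000)
```

With a clear initial majority, the population converges to the majority opinion with high probability after O(n log n) interactions.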
  • Proposal of Automatic Offloading Method in Mixed Offloading Destination Environment
    arXiv.cs.DC Pub Date : 2020-11-24
    Yoji Yamato

    When using heterogeneous hardware, the technical-skill barriers of OpenMP, CUDA, and OpenCL are high. Based on that, I have proposed environment-adaptive software that enables automatic conversion and configuration. However, including existing technologies, there has been no research that properly and automatically offloads to a mixed offloading destination environment such as GPU, FPGA and many core

    Updated: 2020-11-27
  • Distributed Reinforcement Learning is a Dataflow Problem
    arXiv.cs.DC Pub Date : 2020-11-25
    Eric Liang; Zhanghao Wu; Michael Luo; Sven Mika; Ion Stoica

    Researchers and practitioners in the field of reinforcement learning (RL) frequently leverage parallel computation, which has led to a plethora of new algorithms and systems in the last few years. In this paper, we re-examine the challenges posed by distributed RL and try to view it through the lens of an old idea: distributed dataflow. We show that viewing RL as a dataflow problem leads to highly

    Updated: 2020-11-27
  • Distributed Additive Encryption and Quantization for Privacy Preserving Federated Deep Learning
    arXiv.cs.DC Pub Date : 2020-11-25
    Hangyu Zhu; Rui Wang; Yaochu Jin; Kaitai Liang; Jianting Ning

    Homomorphic encryption is a very useful gradient protection technique used in privacy preserving federated learning. However, existing encrypted federated learning systems need a trusted third party to generate and distribute key pairs to connected participants, making them unsuited for federated learning and vulnerable to security risks. Moreover, encrypting all model parameters is computationally

    Updated: 2020-11-27
  • On the Serverless Nature of Blockchains and Smart Contracts
    arXiv.cs.DC Pub Date : 2020-11-24
    Vladimir Yussupov; Ghareeb Falazi; Uwe Breitenbücher; Frank Leymann

    Although historically the term serverless was also used in the context of peer-to-peer systems, it is more frequently associated with the architectural style for developing cloud-native applications. From the developer's perspective, serverless architectures allow reducing management efforts since applications are composed using provider-managed components, e.g., Database-as-a-Service (DBaaS) and

    Updated: 2020-11-27
  • MetaGater: Fast Learning of Conditional Channel Gated Networks via Federated Meta-Learning
    arXiv.cs.DC Pub Date : 2020-11-25
    Sen Lin; Li Yang; Zhezhi He; Deliang Fan; Junshan Zhang

    While deep learning has achieved phenomenal successes in many AI applications, its enormous model size and intensive computation requirements pose a formidable challenge to the deployment in resource-limited nodes. There has recently been an increasing interest in computationally-efficient learning methods, e.g., quantization, pruning and channel gating. However, most existing techniques cannot adapt

    Updated: 2020-11-27
  • The Chunks and Tasks Matrix Library 2.0
    arXiv.cs.DC Pub Date : 2020-11-23
    Emanuel H. Rubensson; Elias Rudberg; Anastasia Kruchinina; Anton G. Artemov

    We present a C++ header-only parallel sparse matrix library, based on sparse quadtree representation of matrices using the Chunks and Tasks programming model. The library implements a number of sparse matrix algorithms for distributed memory parallelization that are able to dynamically exploit data locality to avoid movement of data. This is demonstrated for the example of block-sparse matrix-matrix

    Updated: 2020-11-25
  • The Bloom Clock for Causality Testing
    arXiv.cs.DC Pub Date : 2020-11-23
    Anshuman Misra; Ajay D. Kshemkalyani

    Testing for causality between events in distributed executions is a fundamental problem. Vector clocks solve this problem but do not scale well. The probabilistic Bloom clock can determine causality between events with lower space, time, and message-space overhead than vector clocks; however, its predictions suffer from false positives. We give the protocol for the Bloom clock based on Counting Bloom filters

    Updated: 2020-11-25
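A Bloom clock can be sketched as a counting Bloom filter that is incremented on each event and max-merged on receive. The sizes, hashing scheme, and comparison rule below are assumptions of this sketch, not necessarily those of the paper's protocol:

```python
import hashlib

M, K = 16, 2   # number of counters and hash functions (illustrative sizes)

def positions(pid):
    """K counter indices derived from a process id via salted SHA-256."""
    return [int(hashlib.sha256(f"{pid}:{s}".encode()).hexdigest(), 16) % M
            for s in range(K)]

class BloomClock:
    def __init__(self):
        self.counters = [0] * M

    def tick(self, pid):
        """Local or send event at process `pid`."""
        for p in positions(pid):
            self.counters[p] += 1

    def merge(self, other, pid):
        """Receive event: element-wise max with the sender's clock, then tick."""
        self.counters = [max(a, b)
                         for a, b in zip(self.counters, other.counters)]
        self.tick(pid)

    def happened_before(self, other):
        """May-have-preceded test: like any Bloom structure, a True answer can
        be a false positive, while a False answer is definite."""
        return (all(a <= b for a, b in zip(self.counters, other.counters))
                and self.counters != other.counters)

a, b = BloomClock(), BloomClock()
a.tick(0)        # event e1 at process 0
b.merge(a, 1)    # process 1 receives process 0's clock: e1 causally precedes e2
```

The element-wise comparison is what replaces the vector clock's per-process entries, trading exactness for constant size.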
  • Cost- and QoS-Efficient Serverless Cloud Computing
    arXiv.cs.DC Pub Date : 2020-11-23
    Chavit Denninnart

    Cloud-based serverless computing systems, either public or privately provisioned, aim to provide the illusion of infinite resources and abstract users from details of the allocation decisions. With the goal of providing a low cost and a high QoS, the serverless computing paradigm offers opportunities that can be harnessed to attain the goals. Specifically, our strategy in this dissertation is to avoid

    Updated: 2020-11-25
  • An elastic framework for ensemble-based large-scale data assimilation
    arXiv.cs.DC Pub Date : 2020-11-21
    Sebastian Friedemann; Bruno Raffin

    Prediction of chaotic systems relies on a floating fusion of sensor data (observations) with a numerical model to decide on a good system trajectory and to compensate for nonlinear feedback effects. Ensemble-based data assimilation (DA) is a major method for this purpose, relying on propagating an ensemble of perturbed model realizations. In this paper we develop an elastic, online, fault-tolerant and

    Updated: 2020-11-25
  • Managing Latency in Edge-Cloud Environment
    arXiv.cs.DC Pub Date : 2020-11-23
    Lubomír Bulej; Tomáš Bureš; Adam Filandr; Petr Hnětynka; Iveta Hnětynkova; Jan Pacovský; Gabor Sandor; Ilias Gerostathopoulos

    Modern Cyber-physical Systems (CPS) include applications like smart traffic, smart agriculture, smart power grid, etc. Commonly, these systems are distributed and composed of end-user applications and microservices that typically run in the cloud. The connection with the physical world, which is inherent to CPS, brings the need to operate and respond in real-time. As the cloud becomes part of the computation

    Updated: 2020-11-25
  • Distributed algorithms to determine eigenvectors of matrices on spatially distributed networks
    arXiv.cs.DC Pub Date : 2020-11-23
    Nazar Emirov; Cheng Cheng; Qiyu Sun; Zhihua Qu

    Eigenvectors of matrices on a network have been used for understanding spectral clustering and influence of a vertex. For matrices with small geodesic-width, we propose a distributed iterative algorithm in this letter to find eigenvectors associated with their given eigenvalues. We also consider the implementation of the proposed algorithm at the vertex/agent level in a spatially distributed network

    Updated: 2020-11-25
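The key property that makes a vertex-level implementation possible is that for a matrix with small geodesic-width, each entry of a matrix-vector product depends only on a node's neighbors. The sketch below shows that locality with plain power iteration for the dominant eigenvector (the paper targets eigenvectors of given eigenvalues with a different iteration; matrix and sizes are illustrative):

```python
import math

# Sparse symmetric matrix stored per node: node -> {neighbor: weight}.
# Each node's update reads only its neighbors' entries, which is the local
# structure a spatially distributed implementation exploits.
A = {
    0: {0: 2.0, 1: 1.0},
    1: {0: 1.0, 1: 2.0, 2: 1.0},
    2: {1: 1.0, 2: 2.0},
}

def power_iteration(A, steps=200):
    x = {i: 1.0 for i in A}
    for _ in range(steps):
        y = {i: sum(w * x[j] for j, w in row.items()) for i, row in A.items()}
        norm = math.sqrt(sum(v * v for v in y.values()))
        x = {i: v / norm for i, v in y.items()}
    return x

x = power_iteration(A)
# Rayleigh quotient approximates the dominant eigenvalue (here 2 + sqrt(2))
Ax = {i: sum(w * x[j] for j, w in row.items()) for i, row in A.items()}
lam = sum(Ax[i] * x[i] for i in A)
```

Only the normalization step needs global information; replacing it with a distributed consensus on the norm is one of the issues such algorithms must address.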
  • Peer-to-Peer Energy Systems for Connected Communities: A Review of Recent Advances and Emerging Challenges
    arXiv.cs.DC Pub Date : 2020-11-22
    Wayes Tushar; Chau Yuen; Tapan Saha; Thomas Morstyn; Archie Chapman; M. Jan E Alam; Sarmad Hanif; H. Vincent Poor

    After a century of relative stability of the electricity industry, extensive deployment of distributed energy resources and recent advances in computation and communication technologies have changed the nature of how we consume, trade, and apply energy. The power system is facing a transition from its traditional hierarchical structure to a more deregulated model by introducing new energy distribution

    Updated: 2020-11-25
  • Massively Parallel Causal Inference of Whole Brain Dynamics at Single Neuron Resolution
    arXiv.cs.DC Pub Date : 2020-11-22
    Wassapon Watanakeesuntorn (Nara Institute of Science and Technology, Nara, Japan); Keichi Takahashi (Nara Institute of Science and Technology, Nara, Japan); Kohei Ichikawa (Nara Institute of Science and Technology, Nara, Japan); Joseph Park (U.S. Department of the Interior, Florida, USA); George Sugihara (University of California San Diego, California, USA); Ryousei Takano (National Institute of Advanced Industrial

    Empirical Dynamic Modeling (EDM) is a nonlinear time series causal inference framework. The latest implementation of EDM, cppEDM, has only been used for small datasets due to computational cost. With the growth of data collection capabilities, there is a great need to identify causal relationships in large datasets. We present mpEDM, a parallel distributed implementation of EDM optimized for modern

    Updated: 2020-11-25
  • HALO 1.0: A Hardware-agnostic Accelerator Orchestration Framework for Enabling Hardware-agnostic Programming with True Performance Portability for Heterogeneous HPC
    arXiv.cs.DC Pub Date : 2020-11-22
    Michael Riera; Erfan Bank Tavakoli; Masudul Hassan Quraishi; Fengbo Ren

    Hardware-agnostic programming with high performance portability will be the bedrock for realizing the ubiquitous adoption of emerging accelerator technologies in future heterogeneous high-performance computing (HPC) systems, which is the key to achieving the next level of HPC performance on an expanding accelerator landscape. In this paper, we present HALO 1.0, an open-ended extensible multi-agent

    Updated: 2020-11-25
  • Wireless Distributed Edge Learning: How Many Edge Devices Do We Need?
    arXiv.cs.DC Pub Date : 2020-11-22
    Jaeyoung Song; Marios Kountouris

    We consider distributed machine learning at the wireless edge, where a parameter server builds a global model with the help of multiple wireless edge devices that perform computations on local dataset partitions. Edge devices transmit the result of their computations (updates of current global model) to the server using a fixed rate and orthogonal multiple access over an error prone wireless channel

    Updated: 2020-11-25
  • Study of Resource Amount Configuration for Automatic Application Offloading
    arXiv.cs.DC Pub Date : 2020-11-20
    Yoji Yamato

    In recent years, the utilization of heterogeneous hardware other than small-core CPUs, such as GPUs, FPGAs, and many-core CPUs, has been increasing. However, when using heterogeneous hardware, the technical-skill barriers of OpenMP, CUDA, and OpenCL are high. Based on that, I have proposed environment-adaptive software that enables automatic conversion, configuration, and high-performance operation of once written

    Updated: 2020-11-25
  • A Game-Theoretic Analysis of Cross-Chain Atomic Swaps with HTLCs
    arXiv.cs.DC Pub Date : 2020-11-23
    Jiahua Xu; Damien Ackerer; Alevtina Dubovitskaya

    With the increasing adoption of blockchain technology, there is a strong need for achieving interoperability between unconnected ledgers. Approaches such as hash time lock contracts (HTLCs) have arisen for cross-chain asset exchange. The solution embraces the likelihood of transaction failure and attempts to "make the best out of worst" by allowing transacting agents to at least keep their original

    Updated: 2020-11-25
  • Federated learning with class imbalance reduction
    arXiv.cs.DC Pub Date : 2020-11-23
    Miao Yang; Akitanoshou Wong; Hongbin Zhu; Haifeng Wang; Hua Qian

    Federated learning (FL) is a promising technique that enables a large number of edge computing devices to collaboratively train a global learning model. Due to privacy concerns, the raw data on devices cannot be made available to a centralized server. Constrained by spectrum limitations and computation capacity, only a subset of devices can be engaged to train and transmit the trained model to the centralized

    Updated: 2020-11-25
  • LINDT: Tackling Negative Federated Learning with Local Adaptation
    arXiv.cs.DC Pub Date : 2020-11-23
    Hong Lin; Lidan Shou; Ke Chen; Gang Chen; Sai Wu

    Federated Learning (FL) is a promising distributed learning paradigm, which allows a number of data owners (also called clients) to collaboratively learn a shared model without disclosing each client's data. However, FL may fail to proceed properly, amid a state that we call negative federated learning (NFL). This paper addresses the problem of negative federated learning. We formulate a rigorous definition

    Updated: 2020-11-25
  • TaiJi: Longest Chain Availability with BFT Fast Confirmation
    arXiv.cs.DC Pub Date : 2020-11-22
    Songze Li; David Tse

    Most state machine replication protocols are based either on the 40-year-old Byzantine Fault Tolerance (BFT) theory or on Nakamoto's more recent longest chain design. Longest chain protocols, designed originally in the Proof-of-Work (PoW) setting, are available under dynamic participation, but have probabilistic confirmation with long latency dependent on the security parameter. BFT protocols, designed

    Updated: 2020-11-25
  • Distributed Deep Reinforcement Learning: An Overview
    arXiv.cs.DC Pub Date : 2020-11-22
    Mohammad Reza Samsami; Hossein Alimadad

    Deep reinforcement learning (DRL) is a very active research area. However, several technical and scientific issues need to be addressed, among which we can mention data inefficiency, the exploration-exploitation trade-off, and multi-task learning. Therefore, distributed modifications of DRL were introduced, in which agents can be run on many machines simultaneously. In this article, we provide a survey

    Updated: 2020-11-25
  • A decentralized aggregation mechanism for training deep learning models using smart contract system for bank loan prediction
    arXiv.cs.DC Pub Date : 2020-11-22
    Pratik Ratadiya; Khushi Asawa; Omkar Nikhal

    Data privacy and sharing have always been critical issues when trying to build complex deep learning-based systems to model data. Facilitation of a decentralized approach that could benefit from data across multiple nodes while not needing to merge their data contents physically has been an area of active research. In this paper, we present a solution to benefit from a distributed data setup in

    Updated: 2020-11-25
  • On the Benefits of Multiple Gossip Steps in Communication-Constrained Decentralized Optimization
    arXiv.cs.DC Pub Date : 2020-11-20
    Abolfazl Hashemi; Anish Acharya; Rudrajit Das; Haris Vikalo; Sujay Sanghavi; Inderjit Dhillon

    In decentralized optimization, it is common algorithmic practice to have nodes interleave (local) gradient descent iterations with gossip (i.e. averaging over the network) steps. Motivated by the training of large-scale machine learning models, it is also increasingly common to require that messages be {\em lossy compressed} versions of the local parameters. In this paper, we show that, in such compressed

    Updated: 2020-11-25
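The gossip step the abstract refers to can be sketched in a few lines: each node replaces its parameter with the average of itself and its neighbors, and repeating the step drives all nodes toward the network-wide mean. (Ring topology and uniform 1/3 weights are illustrative; the paper's compression of messages is not modeled here.)

```python
# One gossip (averaging) step on a ring: node i averages with its two
# neighbors. The averaging matrix is doubly stochastic, so the sum of the
# parameters is preserved while disagreement shrinks geometrically.
def gossip_step(x):
    n = len(x)
    return [(x[(i - 1) % n] + x[i] + x[(i + 1) % n]) / 3 for i in range(n)]

def gossip(x, steps):
    for _ in range(steps):
        x = gossip_step(x)
    return x

x0 = [0.0, 0.0, 12.0, 0.0]
after1 = gossip(x0, 1)     # mass spreads to the neighbors
after50 = gossip(x0, 50)   # essentially the mean, 3.0, at every node
```

Interleaving several such steps between local gradient iterations is the design choice whose benefit the paper quantifies under lossy compression.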
  • Locally Solvable Tasks and the Limitations of Valency Arguments
    arXiv.cs.DC Pub Date : 2020-11-20
    Hagit Attiya; Armando Castañeda; Sergio Rajsbaum

    An elegant strategy for proving impossibility results in distributed computing was introduced in the celebrated FLP consensus impossibility proof. This strategy is local in nature as at each stage, one configuration of a hypothetical protocol for consensus is considered, together with future valencies of possible extensions. This proof strategy has been used in numerous situations related to consensus

    Updated: 2020-11-23
  • An Efficient End-to-End Deep Learning Training Framework via Fine-Grained Pattern-Based Pruning
    arXiv.cs.DC Pub Date : 2020-11-20
    Chengming Zhang; Geng Yuan; Wei Niu; Jiannan Tian; Sian Jin; Donglin Zhuang; Zhe Jiang; Yanzhi Wang; Bin Ren; Shuaiwen Leon Song; Dingwen Tao

    Convolutional neural networks (CNNs) are becoming increasingly deeper, wider, and non-linear because of the growing demand on prediction accuracy and analysis quality. The wide and deep CNNs, however, require a large amount of computing resources and processing time. Many previous works have studied model pruning to improve inference performance, but little work has been done for effectively reducing

    Updated: 2020-11-23
  • heSRPT: Parallel Scheduling to Minimize Mean Slowdown
    arXiv.cs.DC Pub Date : 2020-11-18
    Benjamin Berg; Rein Vesilo; Mor Harchol-Balter

    Modern data centers serve workloads which are capable of exploiting parallelism. When a job parallelizes across multiple servers it will complete more quickly, but jobs receive diminishing returns from being allocated additional servers. Because allocating multiple servers to a single job is inefficient, it is unclear how best to allocate a fixed number of servers between many parallelizable jobs.

    Updated: 2020-11-21
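The diminishing-returns effect above can be made concrete with a speedup function s(k) = k^p for p < 1 (a standard illustrative model, not necessarily the paper's): a job of size x on k servers finishes x / s(k) time units after it starts.

```python
# Compare two allocations of 4 servers to two unit-size jobs under
# speedup s(k) = k ** p with p = 0.5 (sublinear, i.e. diminishing returns).

def completion_times_even(sizes, servers, p):
    """All jobs run in parallel, each on an equal share of the servers."""
    k = servers / len(sizes)
    return [x / k ** p for x in sizes]

def completion_times_serial(sizes, servers, p):
    """All servers work on one job at a time, in the given order."""
    t, out = 0.0, []
    for x in sizes:
        t += x / servers ** p
        out.append(t)
    return out

even = completion_times_even([1.0, 1.0], 4, 0.5)     # both at 1/sqrt(2)
serial = completion_times_serial([1.0, 1.0], 4, 0.5) # at 0.5 and 1.0
```

For equal jobs the even split yields the lower mean completion time (about 0.707 versus 0.75); with unequal jobs the optimal allocation skews toward shorter jobs, which is the trade-off a slowdown-minimizing policy must navigate.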
  • Approximate Bipartite Vertex Cover in the CONGEST Model
    arXiv.cs.DC Pub Date : 2020-11-19
    Salwa Faour; Fabian Kuhn

    We give efficient distributed algorithms for the minimum vertex cover problem in bipartite graphs in the CONGEST model. From Kőnig's theorem, it is well known that in bipartite graphs the size of a minimum vertex cover is equal to the size of a maximum matching. We first show that together with an existing $O(n\log n)$-round algorithm for computing a maximum matching, the constructive proof of

    Updated: 2020-11-21
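The constructive proof of Kőnig's theorem mentioned above turns a maximum matching into a minimum vertex cover via an alternating-path search. A sequential sketch on a hand-picked toy instance (the paper's contribution is doing this in the CONGEST model, which this sketch does not capture):

```python
from collections import deque

# Bipartite graph: left vertex -> right neighbors, plus a maximum matching
# given from both sides. Instance and matching are illustrative.
adj = {0: ["a"], 1: ["a", "b"], 2: ["b"]}
match_l = {0: "a", 1: "b"}                 # maximum matching, left view
match_r = {v: u for u, v in match_l.items()}

def koenig_cover(adj, match_l, match_r):
    """Kőnig: cover = (L \\ Z) | (R & Z), where Z is everything reachable
    from unmatched left vertices by alternating paths."""
    left = set(adj)
    z_left = {u for u in left if u not in match_l}   # unmatched left vertices
    z_right = set()
    queue = deque(z_left)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in z_right and match_l.get(u) != v:  # non-matching edge
                z_right.add(v)
                w = match_r.get(v)                        # matching edge back
                if w is not None and w not in z_left:
                    z_left.add(w)
                    queue.append(w)
    return (left - z_left) | z_right

cover = koenig_cover(adj, match_l, match_r)
```

The resulting cover has exactly the size of the matching, certifying both as optimal.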
  • Budgeted Online Selection of Candidate IoT Clients to Participate in Federated Learning
    arXiv.cs.DC Pub Date : 2020-11-16
    Ihab Mohammed; Shadha Tabatabai; Ala Al-Fuqaha; Faissal El Bouanani; Junaid Qadir; Basheer Qolomany; Mohsen Guizani

    Machine Learning (ML), and Deep Learning (DL) in particular, play a vital role in providing smart services to the industry. These techniques however suffer from privacy and security concerns since data is collected from clients and then stored and processed at a central location. Federated Learning (FL), an architecture in which model parameters are exchanged instead of client data, has been proposed

    Updated: 2020-11-21
  • Checking Causal Consistency of Distributed Databases
    arXiv.cs.DC Pub Date : 2020-11-19
    Rachid Zennou; Ranadeep Biswas; Ahmed Bouajjani; Constantin Enea; Mohammed Erradi

    The CAP theorem shows that (strong) Consistency, Availability, and Partition tolerance cannot be ensured together. Causal consistency is one of the weak consistency models that can be implemented to ensure availability and partition tolerance in distributed systems. In this work, we propose a tool to automatically check the conformance of distributed/concurrent systems executions to causal

    Updated: 2020-11-21
  • FedEval: A Benchmark System with a Comprehensive Evaluation Model for Federated Learning
    arXiv.cs.DC Pub Date : 2020-11-19
    Di Chai; Leye Wang; Kai Chen; Qiang Yang

    As an innovative solution for privacy-preserving machine learning (ML), federated learning (FL) is attracting much attention from research and industry areas. While new technologies proposed in the past few years do evolve the FL area, unfortunately, the evaluation results presented in these works fall short in integrity and are hardly comparable because of the inconsistent evaluation metrics and the

    Updated: 2020-11-21
  • High-Throughput and Memory-Efficient Parallel Viterbi Decoder for Convolutional Codes on GPU
    arXiv.cs.DC Pub Date : 2020-11-18
    Alireza Mohammadidoost; Matin Hashemi

    This paper describes a parallel implementation of the Viterbi decoding algorithm. The Viterbi decoder is widely used in many state-of-the-art wireless systems. The proposed solution optimizes both throughput and memory usage by applying optimizations such as a unified kernel implementation and parallel traceback. Experimental evaluations show that the proposed solution achieves higher throughput compared to

    Updated: 2020-11-19
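For reference, here is a textbook hard-decision Viterbi decoder for the rate-1/2, constraint-length-3 convolutional code with generators (7, 5) octal. This is a generic sequential baseline, not the GPU design of the paper above (no unified kernels or parallel traceback):

```python
def conv_encode(bits, state=0):
    """Encode with g1 = u ^ b1 ^ b0 (octal 7) and g2 = u ^ b0 (octal 5)."""
    out = []
    for u in bits:
        b1, b0 = (state >> 1) & 1, state & 1
        out += [u ^ b1 ^ b0, u ^ b0]
        state = (u << 1) | b1
    return out

def viterbi_decode(received, nbits):
    """Maximum-likelihood decoding by dynamic programming over the trellis."""
    INF = float("inf")
    metric = [0 if s == 0 else INF for s in range(4)]   # start in state 0
    paths = [[] for _ in range(4)]
    for t in range(nbits):
        r = received[2 * t: 2 * t + 2]
        new_metric, new_paths = [INF] * 4, [None] * 4
        for s in range(4):
            if metric[s] == INF:
                continue
            b1, b0 = (s >> 1) & 1, s & 1
            for u in (0, 1):
                out = (u ^ b1 ^ b0, u ^ b0)
                ns = (u << 1) | b1
                m = metric[s] + (out[0] != r[0]) + (out[1] != r[1])
                if m < new_metric[ns]:                  # keep the survivor
                    new_metric[ns], new_paths[ns] = m, paths[s] + [u]
        metric, paths = new_metric, new_paths
    return paths[min(range(4), key=lambda s: metric[s])]

msg = [1, 0, 1, 1, 0, 0, 1, 0]
coded = conv_encode(msg)
coded[3] ^= 1                      # inject one channel bit error
decoded = viterbi_decode(coded, len(msg))
```

The per-step survivor selection over all states is the data-parallel core a GPU implementation spreads across threads; the traceback (here, carrying full path lists) is what the parallel-traceback optimization restructures.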
  • Whale: A Unified Distributed Training Framework
    arXiv.cs.DC Pub Date : 2020-11-18
    Ang Wang; Xianyan Jia; Le Jiang; Jie Zhang; Yong Li; Wei Lin

    Data parallelism (DP) has been a common practice to speed up the training workloads for a long time. However, with the increase of data size and model size, DP has become less optimal for most distributed training workloads. Moreover, it does not work on models whose parameter size cannot fit into a single GPU's device memory. To enable and further improve the industrial-level giant model training

    Updated: 2020-11-19
  • A Novel Memory-Efficient Deep Learning Training Framework via Error-Bounded Lossy Compression
    arXiv.cs.DC Pub Date : 2020-11-18
    Sian Jin; Guanpeng Li; Shuaiwen Leon Song; Dingwen Tao

    Deep neural networks (DNNs) are becoming increasingly deeper, wider, and non-linear due to the growing demands on prediction accuracy and analysis quality. When training a DNN model, the intermediate activation data must be saved in the memory during forward propagation and then restored for backward propagation. However, state-of-the-art accelerators such as GPUs are only equipped with very limited

    Updated: 2020-11-19
  • FLaaS: Federated Learning as a Service
    arXiv.cs.DC Pub Date : 2020-11-18
    Nicolas Kourtellis; Kleomenis Katevas; Diego Perino

    Federated Learning (FL) is emerging as a promising technology to build machine learning models in a decentralized, privacy-preserving fashion. Indeed, FL enables local training on user devices, avoiding the transfer of user data to centralized servers, and can be enhanced with differential privacy mechanisms. Although FL has been recently deployed in real systems, the possibility of collaborative

    Updated: 2020-11-19
  • Ginkgo -- A Math Library designed for Platform Portability
    arXiv.cs.DC Pub Date : 2020-11-17
    Terry Cojean; Yu-Hsiang "Mike" Tsai; Hartwig Anzt

    The first associations to software sustainability might be the existence of a continuous integration (CI) framework; the existence of a testing framework composed of unit tests, integration tests, and end-to-end tests; and also the existence of software documentation. However, when asking what is a common deathblow for a scientific software product, it is often the lack of platform and performance

    Updated: 2020-11-19
  • On the Feasibility and Enhancement of the Tuple Space Explosion Attack against Open vSwitch
    arXiv.cs.DC Pub Date : 2020-11-18
    Levente Csikor; Vipul Ujawane; Dinil Mon Divakaran

    Being a crucial part of networked systems, packet classification has to be highly efficient; however, software switches in cloud environments still face performance challenges. The recently proposed Tuple Space Explosion (TSE) attack exploits an algorithmic deficiency in Open vSwitch (OVS). In TSE, legitimate low-rate attack traffic makes the cardinal linear search algorithm in the Tuple Space Search

    Updated: 2020-11-19
  • A Survey of System Architectures and Techniques for FPGA Virtualization
    arXiv.cs.DC Pub Date : 2020-11-18
    Masudul Hassan Quraishi; Erfan Bank Tavakoli; Fengbo Ren

    FPGA accelerators are gaining increasing attention in both cloud and edge computing because of their hardware flexibility, high computational throughput, and low power consumption. However, the design flow of FPGAs often requires specific knowledge of the underlying hardware, which hinders the wide adoption of FPGAs by application developers. Therefore, the virtualization of FPGAs becomes extremely

    Updated: 2020-11-19
  • Distributed Injection-Locking in Analog Ising Machines to Solve Combinatorial Optimizations
    arXiv.cs.DC Pub Date : 2020-11-18
    M. Ali Vosoughi

    The oscillator-based Ising machine (OIM) is a network of coupled CMOS oscillators that solves combinatorial optimization problems. In this paper, the distribution of the injection-locking oscillations throughout the circuit is proposed to accelerate the phase-locking of the OIM. The implications of the proposed technique are theoretically investigated and verified by extensive simulations in EDA tools

    Updated: 2020-11-19
  • Optimal Accuracy-Time Trade-off for Deep Learning Services in Edge Computing Systems
    arXiv.cs.DC Pub Date : 2020-11-17
    Minoo Hosseinzadeh; Andrew Wachal; Hana Khamfroush; Daniel E. Lucani

    With the increasing demand for computationally intensive services like deep learning tasks, emerging distributed computing platforms such as edge computing (EC) systems are becoming more popular. Edge computing systems have shown promising results in terms of latency reduction compared to the traditional cloud systems. However, their limited processing capacity imposes a trade-off between the potential

    Updated: 2020-11-18
  • GPURepair: Automated Repair of GPU Kernels
    arXiv.cs.DC Pub Date : 2020-11-17
    Saurabh Joshi; Gautam Muduganti

    This paper presents a tool for repairing errors in GPU kernels written in CUDA or OpenCL due to data races and barrier divergence. Our novel extension to prior work can also remove barriers that are deemed unnecessary for correctness. We implement these ideas in our tool called GPURepair, which uses GPUVerify as the verification oracle for GPU kernels. We also extend GPUVerify to support CUDA Cooperative

    Updated: 2020-11-18
  • Uniform Bipartition in the Population Protocol Model with Arbitrary Communication Graphs
    arXiv.cs.DC Pub Date : 2020-11-17
    Hiroto Yasumi; Fukuhito Ooshita; Michiko Inoue; Sébastien Tixeuil

    In this paper, we focus on the uniform bipartition problem in the population protocol model. This problem aims to divide a population into two groups of equal size. In particular, we consider the problem in the context of \emph{arbitrary} communication graphs. As a result, we clarify the solvability of the uniform bipartition problem with arbitrary communication graphs when agents in the population

    Updated: 2020-11-18
  • Heterogeneous Paxos: Technical Report
    arXiv.cs.DC Pub Date : 2020-11-16
    Isaac Sheff; Xinwen Wang; Robbert van Renesse; Andrew C. Myers

    In distributed systems, a group of learners achieve consensus when, by observing the output of some acceptors, they all arrive at the same value. Consensus is crucial for ordering transactions in failure-tolerant systems. Traditional consensus algorithms are homogeneous in three ways: all learners are treated equally, all acceptors are treated equally, and all

    Updated: 2020-11-18
  • Improved Load Balancing in Large Scale Systems using Attained Service Time Reporting
    arXiv.cs.DC Pub Date : 2020-11-16
    Tim Hellemans; Benny Van Houdt

    Our interest lies in load balancing jobs in large scale systems consisting of multiple dispatchers and FCFS servers. In the absence of any information on job sizes, dispatchers typically use queue length information reported by the servers to assign incoming jobs. When job sizes are highly variable, using only queue length information is clearly suboptimal and performance can be improved if some indication

    Updated: 2020-11-18
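The queue-length-based dispatching the abstract takes as its starting point is commonly modeled as "join the shortest of d sampled queues". A minimal simulation sketch (the paper's attained-service refinement is not modeled; parameters are illustrative):

```python
import random

def jsq_d(queue_lengths, d, rng=random):
    """Sample d distinct servers and return the index of the shortest queue."""
    sampled = rng.sample(range(len(queue_lengths)), d)
    return min(sampled, key=lambda i: queue_lengths[i])

# 100 arrivals to 10 initially empty servers with d = 2 samples per job
# (departures omitted to keep the sketch short).
random.seed(0)
queues = [0] * 10
for _ in range(100):
    queues[jsq_d(queues, 2)] += 1
```

Even d = 2 spreads load far better than a single uniform sample; the paper's point is that queue length alone remains a crude signal when job sizes are highly variable.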
  • Stochastic Client Selection for Federated Learning with Volatile Clients
    arXiv.cs.DC Pub Date : 2020-11-17
    Tiansheng Huang; Weiwei Lin; Keqin Li; Albert Y. Zomaya

    Federated Learning (FL), arising as a novel secure learning paradigm, has received notable attention from the public. In each round of synchronous FL training, only a fraction of available clients are chosen to participate and the selection of which might have a direct or indirect effect on the training efficiency, as well as the final model performance. In this paper, we investigate the client selection

    Updated: 2020-11-18
  • FTK: A High-Dimensional Simplicial Meshing Framework for Robust and Scalable Feature Tracking
    arXiv.cs.DC Pub Date : 2020-11-17
    Hanqi Guo; David Lenz; Jiayi Xu; Xin Liang; Wenbin He; Iulian R. Grindeanu; Han-Wei Shen; Tom Peterka; Todd Munson; Ian Foster

    We present the Feature Tracking Kit (FTK), a framework that simplifies, scales, and delivers various feature-tracking algorithms for scientific data. The key of FTK is our high-dimensional simplicial meshing scheme that generalizes both regular and unstructured spatial meshes to spacetime while tessellating spacetime mesh elements into simplices. The benefits of using simplicial spacetime meshes include

    Updated: 2020-11-18
  • MobChain: Three-Way Collusion Resistance in Witness-Oriented Location Proof Systems Using Distributed Consensus
    arXiv.cs.DC Pub Date : 2020-11-17
    Faheem Zafar; Abid Khan; Saif Ur Rehman Malik; Adeel Anjum; Mansoor Ahmed

    Smart devices have accentuated the importance of geolocation information. Geolocation identification using smart devices has paved the path for incentive-based location-based services (LBS). A location proof is a digital certificate of the geographical location of a user, which can be used to access various LBS. However, a user's full control over a device allows the tampering of location proofs. Initially

    Updated: 2020-11-18
  • Federated Composite Optimization
    arXiv.cs.DC Pub Date : 2020-11-17
    Honglin Yuan; Manzil Zaheer; Sashank Reddi

    Federated Learning (FL) is a distributed learning paradigm which scales on-device learning collaboratively and privately. Standard FL algorithms such as Federated Averaging (FedAvg) are primarily geared towards smooth unconstrained settings. In this paper, we study the Federated Composite Optimization (FCO) problem, where the objective function in FL includes an additive (possibly) non-smooth component

    Updated: 2020-11-18
  • Optimizing Graph Processing and Preprocessing with Hardware Assisted Propagation Blocking
    arXiv.cs.DC Pub Date : 2020-11-17
    Vignesh Balaji; Brandon Lucia

    Extensive prior research has focused on alleviating the characteristic poor cache locality of graph analytics workloads. However, graph pre-processing tasks remain relatively unexplored. In many important scenarios, graph pre-processing tasks can be as expensive as the downstream graph analytics kernel. We observe that Propagation Blocking (PB), a software optimization designed for SpMV kernels, generalizes

    Updated: 2020-11-18
  • Revising the classic computing paradigm and its technological implementations
    arXiv.cs.DC Pub Date : 2020-11-16
    János Végh

    Today's computing is said to be based on the classic paradigm proposed by von Neumann three-quarters of a century ago. However, that paradigm was justified only for the timing relations of vacuum tubes. Technological development has invalidated the classic paradigm (but not the model!) and led to catastrophic performance losses in computing systems, from the gate level to large networks, including

    Updated: 2020-11-18
  • Avoiding Communication in Logistic Regression
    arXiv.cs.DC Pub Date : 2020-11-16
    Aditya Devarakonda; James Demmel

    Stochastic gradient descent (SGD) is one of the most widely used optimization methods for solving various machine learning problems. SGD solves an optimization problem by iteratively sampling a few data points from the input data, computing gradients for the selected data points, and updating the solution. However, in a parallel setting, SGD requires interprocess communication at every iteration. We
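The SGD loop described above can be sketched as follows for logistic regression; in a parallel run, the gradient sum on the sampled batch would be an all-reduce, i.e. one communication round per iteration (the communication-avoiding variant is the paper's contribution and is not reproduced here).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_logreg(X, y, lr=0.5, batch=8, iters=500, seed=0):
    """Minibatch SGD for logistic regression: sample a few rows,
    compute the gradient on them, update the weights.  In a
    distributed setting the gradient would be all-reduced each
    iteration, which is the communication cost at issue."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        idx = rng.integers(0, len(y), size=batch)
        p = sigmoid(X[idx] @ w)
        grad = X[idx].T @ (p - y[idx]) / batch
        w -= lr * grad
    return w

# Linearly separable synthetic data.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
y = (X @ np.array([2.0, -3.0]) > 0).astype(float)
w = sgd_logreg(X, y)
acc = np.mean((sigmoid(X @ w) > 0.5) == (y == 1))
```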

    Updated: 2020-11-18
  • Towards Collaborative Optimization of Cluster Configurations for Distributed Dataflow Jobs
    arXiv.cs.DC Pub Date : 2020-11-16
    Jonathan Will; Jonathan Bader; Lauritz Thamsen

    Analyzing large datasets with distributed dataflow systems requires the use of clusters. Public cloud providers offer a large variety and quantity of resources that can be used for such clusters. However, picking the appropriate resources in both type and number can often be challenging, as the selected configuration needs to match a distributed dataflow job's resource demands and access patterns.

    Updated: 2020-11-17
  • Secured Distributed Algorithms without Hardness Assumptions
    arXiv.cs.DC Pub Date : 2020-11-16
    Leonid Barenboim; Harel Levin

    We study algorithms in the distributed message-passing model that produce secured output, for an input graph $G$. Specifically, each vertex computes its part in the output, the entire output is correct, but each vertex cannot discover the output of other vertices, with a certain probability. This is motivated by high-performance processors that are embedded nowadays in a large variety of devices. In

    Updated: 2020-11-17
  • Video Big Data Analytics in the Cloud: A Reference Architecture, Survey, Opportunities, and Open Research Issues
    arXiv.cs.DC Pub Date : 2020-11-16
    Aftab Alam; Irfan Ullah; Young-Koo Lee

    The proliferation of multimedia devices over the Internet of Things (IoT) generates an unprecedented amount of data. Consequently, the world has stepped into the era of big data. Recently, on the rise of distributed computing technologies, video big data analytics in the cloud has attracted the attention of researchers and practitioners. The current technology and market trends demand an efficient

    Updated: 2020-11-17
  • Crowdsharing Wireless Energy Services
    arXiv.cs.DC Pub Date : 2020-11-15
    Abdallah Lakhdari; Amani Abusafia; Athman Bouguettaya

    We propose a novel self-sustained ecosystem for energy sharing in the IoT environment. We leverage energy harvesting, wireless power transfer, and crowdsourcing that facilitate the development of an energy crowdsharing framework to charge IoT devices. The ubiquity of IoT devices, coupled with the potential ability for sharing energy, provides new and exciting opportunities to crowdsource wireless energy

    Updated: 2020-11-17
  • Recoverable, Abortable, and Adaptive Mutual Exclusion with Sublogarithmic RMR Complexity
    arXiv.cs.DC Pub Date : 2020-11-15
    Daniel Katzan; Adam Morrison

    We present the first recoverable mutual exclusion (RME) algorithm that is simultaneously abortable, adaptive to point contention, and with sublogarithmic RMR complexity. Our algorithm has $O(\min(K,\log_W N))$ RMR passage complexity and $O(F + \min(K,\log_W N))$ RMR super-passage complexity, where $K$ is the number of concurrent processes (point contention), $W$ is the size (in bits) of registers,

    Updated: 2020-11-17
  • Echo-CGC: A Communication-Efficient Byzantine-tolerant Distributed Machine Learning Algorithm in Single-Hop Radio Network
    arXiv.cs.DC Pub Date : 2020-11-15
    Qinzi Zhang; Lewis Tseng

    In this paper, we focus on a popular DML framework -- the parameter server computation paradigm and iterative learning algorithms that proceed in rounds. We aim to reduce the communication complexity of Byzantine-tolerant DML algorithms in the single-hop radio network. Inspired by the CGC filter developed by Gupta and Vaidya, PODC 2020, we propose a gradient descent-based algorithm, Echo-CGC. Our main
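As context for the CGC filter the abstract builds on, the sketch below shows a norm-clipping aggregation rule in that spirit: gradients whose norms are suspiciously large are scaled down before averaging. The clipping threshold and rule here are assumptions for illustration, not the exact filter of Gupta and Vaidya or of Echo-CGC.

```python
import numpy as np

def clip_and_average(grads, f):
    """Norm-clipping aggregation in the spirit of a CGC-style
    filter (illustrative rule, not the paper's statement): scale
    any gradient whose norm exceeds the (n - f)-th smallest norm
    down to that norm, then average.  f = assumed bound on the
    number of Byzantine workers."""
    norms = np.linalg.norm(grads, axis=1)
    thresh = np.sort(norms)[len(grads) - f - 1]  # (n - f)-th smallest
    clipped = []
    for g, nrm in zip(grads, norms):
        if nrm > thresh and nrm > 0:
            g = g * (thresh / nrm)   # rescale outlier to the threshold
        clipped.append(g)
    return np.mean(clipped, axis=0)

# Four honest gradients near [1, 1]; one Byzantine outlier.
grads = np.array([[1.0, 1.0]] * 4 + [[-100.0, 100.0]])
agg = clip_and_average(grads, f=1)
```

A plain average of these gradients would be dominated by the outlier ([-19.2, 20.8]); the clipped average stays close to the honest direction.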

    Updated: 2020-11-17
Contents have been reproduced by permission of the publishers.