Current journal: arXiv - CS - Distributed, Parallel, and Cluster Computing
  • Pipelined Backpropagation at Scale: Training Large Models without Batches
    arXiv.cs.DC Pub Date : 2020-03-25
    Atli Kosson; Vitaliy Chiley; Abhinav Venigalla; Joel Hestness; Urs Köster

    Parallelism is crucial for accelerating the training of deep neural networks. Pipeline parallelism can provide an efficient alternative to traditional data parallelism by allowing workers to specialize. Performing mini-batch SGD using pipeline parallelism has the overhead of filling and draining the pipeline. Pipelined Backpropagation updates the model parameters without draining the pipeline. This

    Updated: 2020-03-27
  • State-Machine Replication for Planet-Scale Systems (Extended Version)
    arXiv.cs.DC Pub Date : 2020-03-26
    Vitor Enes; Carlos Baquero; Tuanir França Rezende; Alexey Gotsman; Matthieu Perrin; Pierre Sutra

    Online applications now routinely replicate their data at multiple sites around the world. In this paper we present Atlas, the first state-machine replication protocol tailored for such planet-scale systems. Atlas does not rely on a distinguished leader, so clients enjoy the same quality of service independently of their geographical locations. Furthermore, client-perceived latency improves as we add

    Updated: 2020-03-27
  • Thalamo-cortical spiking model of incremental learning combining perception, context and NREM-sleep-mediated noise-resilience
    arXiv.cs.DC Pub Date : 2020-03-26
    Bruno Golosio; Chiara De Luca; Cristiano Capone; Elena Pastorelli; Giovanni Stegel; Gianmarco Tiddia; Giulia De Bonis; Pier Stanislao Paolucci

    The brain exhibits capabilities of fast incremental learning from few noisy examples, as well as the ability to associate similar memories in autonomously-created categories and to combine contextual hints with sensory perceptions. Together with sleep, these mechanisms are thought to be key components of many high-level cognitive functions. Yet, little is known about the underlying processes and the

    Updated: 2020-03-27
  • Implementing a GPU-based parallel MAX-MIN Ant System
    arXiv.cs.DC Pub Date : 2020-01-18
    Rafał Skinderowicz

    The MAX-MIN Ant System (MMAS) is one of the best-known Ant Colony Optimization (ACO) algorithms proven to be efficient at finding satisfactory solutions to many difficult combinatorial optimization problems. The slow-down in Moore's law, and the availability of graphics processing units (GPUs) capable of conducting general-purpose computations at high speed, have sparked considerable research efforts

    Updated: 2020-03-27
  • Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs
    arXiv.cs.DC Pub Date : 2019-01-24
    Valentina Zantedeschi; Aurélien Bellet; Marc Tommasi

    We consider the fully decentralized machine learning scenario where many users with personal datasets collaborate to learn models through local peer-to-peer exchanges, without a central coordinator. We propose to train personalized models that leverage a collaboration graph describing the relationships between user personal tasks, which we learn jointly with the models. Our fully decentralized optimization

    Updated: 2020-03-27
  • Efficient Concurrent Execution of Smart Contracts in Blockchains using Object-based Transactional Memory
    arXiv.cs.DC Pub Date : 2019-03-31
    Parwat Singh Anjana; Hagit Attiya; Sweta Kumari; Sathya Peri; Archit Somani

    This paper proposes an efficient framework to execute Smart Contract Transactions (SCTs) concurrently based on object semantics, using optimistic Single-Version Object-based Software Transactional Memory Systems (SVOSTMs) and Multi-Version OSTMs (MVOSTMs). In our framework, a multi-threaded miner constructs a Block Graph (BG), capturing the object-conflicts relations between SCTs, and stores it in

    Updated: 2020-03-27
  • Self-Repairing Hardware Architecture for Safety-Critical Cyber-Physical-Systems
    arXiv.cs.DC Pub Date : 2019-10-30
    Shawkat Khairullah; Carl Elks

    Digital embedded systems in safety-critical cyber-physical-systems (CPSs) require high levels of resilience and robustness against different fault classes. In recent years, self-healing concepts based on biological physiology have received attention for the design and implementation of reliable systems. However, many of these approaches have not been architected from the outset with safety in mind

    Updated: 2020-03-27
  • Probabilistic Dynamic Hard Real-Time Scheduling in HPC
    arXiv.cs.DC Pub Date : 2019-12-05
    Florian Hofer; Martin A. Sehr; Alberto Sangiovanni-Vincentelli; Barbara Russo

    Industry 4.0 is fundamentally changing the way data is collected, stored and analyzed in industrial processes. While this change enables novel applications such as flexible manufacturing of highly customized products, the real-time control of these processes has not yet realized its full potential. We believe that modern virtualization techniques, specifically application containers

    Updated: 2020-03-27
  • Overview of the IBM Neural Computer Architecture
    arXiv.cs.DC Pub Date : 2020-03-25
    Pritish Narayanan; Charles E. Cox; Alexis Asseman; Nicolas Antoine; Harald Huels; Winfried W. Wilcke; Ahmet S. Ozcan

    The IBM Neural Computer (INC) is a highly flexible, re-configurable parallel processing system that is intended as a research and development platform for emerging machine intelligence algorithms and computational neuroscience. It consists of hundreds of programmable nodes, primarily based on Xilinx's Field Programmable Gate Array (FPGA) technology. The nodes are interconnected in a scalable 3D mesh

    Updated: 2020-03-26
  • NVMe and PCIe SSD Monitoring in Hyperscale Data Centers
    arXiv.cs.DC Pub Date : 2020-03-25
    Nikhil Khatri; Shirshendu Chakrabarti

    With low latency, high throughput and enterprise-grade reliability, SSDs have become the de-facto choice for storage in the data center. As a result, SSDs are used in all online data stores in LinkedIn. These apps persist and serve critical user data and have millisecond latencies. For the hosts serving these applications, SSD faults are the single largest cause of failure. Frequent SSD failures result

    Updated: 2020-03-26
  • Next-Generation Information Technology Systems for Fast Detectors in Electron Microscopy
    arXiv.cs.DC Pub Date : 2020-03-25
    Dieter Weber; Alexander Clausen; Rafal E. Dunin-Borkowski

    The Gatan K2 IS direct electron detector (Gatan Inc., 2018), which was introduced in 2014, marked a watershed moment in the development of cameras for transmission electron microscopy (TEM) (Pan & Czarnik, 2016). Its pixel frequency, i.e. the number of data points (pixels) recorded per second, was two orders of magnitude higher than the fastest cameras available only five years before. Starting from

    Updated: 2020-03-26
  • Dynamo -- Handling Scientific Data Across Sites and Storage Media
    arXiv.cs.DC Pub Date : 2020-03-25
    Yutaro Iiyama; Benedikt Maier; Daniel Abercrombie; Maxim Goncharov; Christoph Paus

    Dynamo is a full-stack software solution for scientific data management. Dynamo's architecture is modular, extensible, and customizable, making the software suitable for managing data in a wide range of installation scales, from a few terabytes stored at a single location to hundreds of petabytes distributed across a worldwide computing grid. This article documents the core system design of Dynamo

    Updated: 2020-03-26
  • A Hybrid MPI+Threads Approach to Particle Group Finding Using Union-Find
    arXiv.cs.DC Pub Date : 2020-03-25
    James S. Willis; Matthieu Schaller; Pedro Gonnet; John C. Helly

    The Friends-of-Friends (FoF) algorithm is a standard technique used in cosmological $N$-body simulations to identify structures. Its goal is to find clusters of particles (called groups) that are separated by at most a cut-off radius. $N$-body simulations typically use most of the memory present on a node, leaving very little free for a FoF algorithm to run on-the-fly. We propose a new method that

    Updated: 2020-03-26
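
    The Friends-of-Friends entry above describes grouping particles that lie within a cut-off radius of each other using Union-Find. The sketch below illustrates that general idea only; it is not the authors' hybrid MPI+threads method. The function names (`find`, `union`, `fof_groups`), the brute-force pair search, and the rule of linking pairs at distance less than or equal to the radius are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def find(parent, i):
    """Return the root of i, shortening the path as we walk (path halving)."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent, a, b):
    """Merge the sets containing a and b; the smaller root index wins."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def fof_groups(points, radius):
    """Label each point with the root index of its group.

    Two points are linked when their distance is <= radius (assumed rule);
    groups are the connected components of that link graph.
    Brute-force O(n^2) pair search, for illustration only.
    """
    n = len(points)
    parent = list(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if dist(points[i], points[j]) <= radius:
                union(parent, i, j)
    return [find(parent, i) for i in range(n)]


# Hypothetical toy input: three chained particles and one isolated particle.
labels = fof_groups([(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0)], radius=0.6)
```

    Points 0 and 2 are farther apart than the radius but still share a label, because linking is transitive through point 1. A production code would replace the quadratic pair search with a spatial grid or tree.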
  • Almost Global Problems in the LOCAL Model
    arXiv.cs.DC Pub Date : 2018-05-12
    Alkida Balliu; Sebastian Brandt; Dennis Olivetti; Jukka Suomela

    The landscape of the distributed time complexity is nowadays well-understood for subpolynomial complexities. When we look at deterministic algorithms in the LOCAL model and locally checkable problems (LCLs) in bounded-degree graphs, the following picture emerges: - There are lots of problems with time complexities of $\Theta(\log^* n)$ or $\Theta(\log n)$. - It is not possible to have a problem with

    Updated: 2020-03-26
  • Deep Learning Training in Facebook Data Centers: Design of Scale-up and Scale-out Systems
    arXiv.cs.DC Pub Date : 2020-03-20
    Maxim Naumov; John Kim; Dheevatsa Mudigere; Srinivas Sridharan; Xiaodong Wang; Whitney Zhao; Serhat Yilmaz; Changkyu Kim; Hector Yuen; Mustafa Ozdal; Krishnakumar Nair; Isabel Gao; Bor-Yiing Su; Jiyan Yang; Mikhail Smelyanskiy

    Large-scale training is important to ensure high performance and accuracy of machine-learning models. At Facebook we use many different models, including computer vision, video and language models. However, in this paper we focus on the deep learning recommendation models (DLRMs), which are responsible for more than 50% of the training demand in our data centers. Recommendation models present unique

    Updated: 2020-03-24
  • An Energy-Aware Online Learning Framework for Resource Management in Heterogeneous Platforms
    arXiv.cs.DC Pub Date : 2020-03-20
    Sumit K. Mandal; Ganapati Bhat; Janardhan Rao Doppa; Partha Pratim Pande; Umit Y. Ogras

    Mobile platforms must satisfy the contradictory requirements of fast response time and minimum energy consumption as a function of dynamically changing applications. To address this need, system-on-chips (SoC) that are at the heart of these devices provide a variety of control knobs, such as the number of active cores and their voltage/frequency levels. Controlling these knobs optimally at runtime

    Updated: 2020-03-24
  • Resilience in Collaborative Optimization: Redundant and Independent Cost Functions
    arXiv.cs.DC Pub Date : 2020-03-21
    Nirupam Gupta; Nitin H. Vaidya

    This report considers the problem of Byzantine fault-tolerance in multi-agent collaborative optimization. In this problem, each agent has a local cost function. The goal of a collaborative optimization algorithm is to compute a minimum of the aggregate of the agents' cost functions. We consider the case when a certain number of agents may be Byzantine faulty. Such faulty agents may not follow a prescribed

    Updated: 2020-03-24
  • Message complexity of population protocols
    arXiv.cs.DC Pub Date : 2020-03-20
    Talley Amir; James Aspnes; David Doty; Mahsa Eftekhari H.; Eric Severson

    The standard population protocol model assumes that when two agents interact, each observes the entire state of the other agent. We initiate the study of the message complexity for population protocols, where the state of an agent is divided into an externally-visible message and an internal component, where only the message can be observed by the other agent in an interaction

    Updated: 2020-03-24
  • Towards an Enterprise-Ready Implementation of Artificial Intelligence-Enabled, Blockchain-Based Smart Contracts
    arXiv.cs.DC Pub Date : 2020-03-21
    Philipp Brune (Neu-Ulm University of Applied Sciences, Neu-Ulm, Germany)

    Blockchain technology and artificial intelligence (AI) are current hot topics in research and practice. However, the potential of their combination has only recently been studied to a larger extent. While different use cases for combining AI and blockchain have been discussed, the idea of enabling blockchain-based smart contracts to perform "smarter" decisions by using AI or machine learning (ML)

    Updated: 2020-03-24
  • HierTrain: Fast Hierarchical Edge AI Learning with Hybrid Parallelism in Mobile-Edge-Cloud Computing
    arXiv.cs.DC Pub Date : 2020-03-22
    Deyin Liu; Xu Chen; Zhi Zhou; Qing Ling

    Nowadays, deep neural networks (DNNs) are the core enablers for many emerging edge AI applications. Conventional approaches to training DNNs are generally implemented at central servers or cloud centers for centralized learning, which is typically time-consuming and resource-demanding due to the transmission of a large amount of data samples from the device to the remote cloud. To overcome these disadvantages

    Updated: 2020-03-24
  • Being Fast Means Being Chatty: The Local Information Cost of Graph Spanners
    arXiv.cs.DC Pub Date : 2020-03-22
    Peter Robinson

    We introduce a new measure for quantifying the amount of information that the nodes in a network need to learn to jointly solve a graph problem. We show that the local information cost presents a natural lower bound on the communication complexity of distributed algorithms. We demonstrate the application of local information cost by deriving a lower bound on the communication complexity of computing

    Updated: 2020-03-24
  • On the scalability of CFD tool for supersonic jet flow configurations
    arXiv.cs.DC Pub Date : 2020-03-18
    Carlos Junqueira-Junior; João Luiz F. Azevedo; Jairo Panetta; William R. Wolf; Sami Yamouni

    New regulations are imposing noise emission limits on the aviation industry, which are pushing researchers and engineers to invest effort in studying aeroacoustic phenomena. Following this trend, an in-house computational fluid dynamics tool is built to reproduce high-fidelity results of supersonic jet flows for aeroacoustic analogy applications. The solver is written using the large eddy

    Updated: 2020-03-24
  • Distributed Computation with Continual Population Growth
    arXiv.cs.DC Pub Date : 2020-03-22
    Da-Jung Cho; Matthias Függer; Corbin Hopper; Manish Kushwaha; Thomas Nowak; Quentin Soubeyran

    Computing with synthetically modified bacteria is a vibrant and active field with numerous applications in bio-production, bio-sensors, and medicine. Recently, distributed approaches with communication among bacteria have gained interest, motivated by a lack of robustness and by resource limitations in single cells. In this paper, we focus on the problem of population growth happening in parallel,

    Updated: 2020-03-24
  • A Transactional Perspective on Execute-order-validate Blockchains
    arXiv.cs.DC Pub Date : 2020-03-23
    Pingcheng Ruan; Dumitrel Loghin; Quang-Trung Ta; Meihui Zhang; Gang Chen; Beng Chin Ooi

    Smart contracts have enabled blockchain systems to evolve from simple cryptocurrency platforms, such as Bitcoin, to general transactional systems, such as Ethereum. Catering for emerging business requirements, a new architecture called execute-order-validate has been proposed in Hyperledger Fabric to support parallel transactions and improve the blockchain's throughput. However, this new architecture

    Updated: 2020-03-24
  • Author's approach to the topological modeling of parallel computing systems
    arXiv.cs.DC Pub Date : 2020-03-23
    Victor A. Melent'ev

    The author's research of topologies of parallel computing systems and the tasks solved with them, including the corresponding tools of their modeling, is summarized in the present paper. The original topological model of such systems is presented based on the modified Amdahl law. It allowed formalizing the dependence of the necessary number of processors and the maximal distance between information-adjacent

    Updated: 2020-03-24
  • Soteria: A Provably Compliant User Right Manager Using a Novel Two-Layer Blockchain Technology
    arXiv.cs.DC Pub Date : 2020-03-23
    Wei-Kang Fu; Yi-Shan Lin; Giovanni Campagna; De-Yi Tsai; Chun-Ting Liu; Chung-Huan Mei; Edward Y. Chang; Monica S. Lam; Shih-Wei Liao

    Soteria is a user right management system designed to safeguard user-data privacy in a transparent and provable manner in compliance to regulations such as GDPR and CCPA. Soteria represents user data rights as formal executable sharing agreements, which can automatically be translated into a human readable form and enforced as data are queried. To support revocation and to prove compliance, an indelible

    Updated: 2020-03-24
  • A Unified Theory of Decentralized SGD with Changing Topology and Local Updates
    arXiv.cs.DC Pub Date : 2020-03-23
    Anastasia Koloskova; Nicolas Loizou; Sadra Boreiri; Martin Jaggi; Sebastian U. Stich

    Decentralized stochastic optimization methods have gained a lot of attention recently, mainly because of their cheap per iteration cost, data locality, and their communication-efficiency. In this paper we introduce a unified convergence analysis that covers a large variety of decentralized SGD methods which so far have required different intuitions, have different applications, and which have been

    Updated: 2020-03-24
  • Incentives in Ethereum's Hybrid Casper Protocol
    arXiv.cs.DC Pub Date : 2019-03-11
    Vitalik Buterin; Daniel Reijsbergen; Stefanos Leonardos; Georgios Piliouras

    We present an overview of hybrid Casper the Friendly Finality Gadget (FFG): a Proof-of-Stake checkpointing protocol overlaid onto Ethereum's Proof-of-Work blockchain. We describe its core functionalities and reward scheme, and explore its properties. Our findings indicate that Casper's implemented incentives mechanism ensures liveness, while providing safety guarantees that improve over standard Proof-of-Work

    Updated: 2020-03-24
  • Efficiency Guarantees for Parallel Incremental Algorithms under Relaxed Schedulers
    arXiv.cs.DC Pub Date : 2020-03-20
    Dan Alistarh; Nikita Koval; Giorgi Nadiradze

    Several classic problems in graph processing and computational geometry are solved via incremental algorithms, which split computation into a series of small tasks acting on shared state, which gets updated progressively. While the sequential variant of such algorithms usually specifies a fixed (but sometimes random) order in which the tasks should be performed, a standard approach to parallelizing

    Updated: 2020-03-24
  • DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression
    arXiv.cs.DC Pub Date : 2019-05-15
    Hanlin Tang; Xiangru Lian; Chen Yu; Tong Zhang; Ji Liu

    A standard approach in large scale machine learning is distributed stochastic gradient training, which requires the computation of aggregated stochastic gradients over multiple nodes on a network. Communication is a major bottleneck in such applications, and in recent years, compressed stochastic gradient methods such as QSGD (quantized SGD) and sparse SGD have been proposed to reduce communication

    Updated: 2020-03-24
  • Apps Gone Rogue: Maintaining Personal Privacy in an Epidemic
    arXiv.cs.DC Pub Date : 2020-03-19
    Ramesh Raskar; Isabel Schunemann; Rachel Barbar; Kristen Vilcans; Jim Gray; Praneeth Vepakomma; Suraj Kapa; Andrea Nuzzo; Rajiv Gupta; Alex Berke; Dazza Greenwood; Christian Keegan; Shriank Kanaparti; Robson Beaudry; David Stansbury; Beatriz Botero Arcila; Rishank Kanaparti; Vitor Pamplona; Francesco M Benedetti; Alina Clough; Riddhiman Das; Kaushal Jain; Khahlil Louisy; Greg Nadeau; Vitor Pamplona;

    Containment, the key strategy in quickly halting an epidemic, requires rapid identification and quarantine of the infected individuals, determination of whom they have had close contact with in the previous days and weeks, and decontamination of locations the infected individual has visited. Achieving containment demands accurate and timely collection of the infected individual's location and contact

    Updated: 2020-03-20
  • Strong Scaling of Numerical Solver for Supersonic Jet Flow Configuration
    arXiv.cs.DC Pub Date : 2020-03-19
    Carlos Junqueira-Junior; João Luiz F. Azevedo; Jairo Panetta; William R. Wolf; Sami Yamouni

    Acoustic loads are rocket design constraints which push researchers and engineers to invest effort in the aeroacoustic phenomena present on launch vehicles. Therefore, an in-house computational fluid dynamics tool is developed in order to reproduce high-fidelity results of supersonic jet flows for aeroacoustic analogy applications. The solver is written using the large eddy simulation formulation

    Updated: 2020-03-20
  • Utility Optimal Thread Assignment and Resource Allocation in Distributed Systems
    arXiv.cs.DC Pub Date : 2015-07-04
    Pan Lai; Rui Fan; Xiao Zhang; Wei Zhang; Fang Liu

    Achieving high performance in many distributed systems, such as a web hosting center or the cloud requires finding a good assignment of worker threads to servers and also effectively allocating each server's resources to its assigned threads. The assignment and allocation components of this problem have been studied extensively but largely separately in the literature. In this paper, we introduce the

    Updated: 2020-03-20
  • Towards Peer-to-Peer Energy Market: an Overview
    arXiv.cs.DC Pub Date : 2020-03-02
    Ramon Christen; Luca Mazzola; Alexander Denzler

    This paper provides an overview of the current status of the energy market, with respect to the increasing number of decentralised prosumers. After outlining the limitations imposed by the status quo, a possible multi-layered architecture of a Peer-to-Peer (P2P) energy market is introduced. The fundamental aspects of local production and local consumption as part of a microgrid are discussed. Changes

    Updated: 2020-03-20
  • Co-Optimizing Performance and Memory Footprint Via Integrated CPU/GPU Memory Management, an Implementation on Autonomous Driving Platform
    arXiv.cs.DC Pub Date : 2020-03-17
    Soroush Bateni; Zhendong Wang; Yuankun Zhu; Yang Hu; Cong Liu

    Cutting-edge embedded system applications, such as self-driving cars and unmanned drone software, are reliant on integrated CPU/GPU platforms for their DNNs-driven workload, such as perception and other highly parallel components. In this work, we set out to explore the hidden performance implication of GPU memory management methods of integrated CPU/GPU architecture. Through a series of experiments

    Updated: 2020-03-19
  • ContainerStress: Autonomous Cloud-Node Scoping Framework for Big-Data ML Use Cases
    arXiv.cs.DC Pub Date : 2020-03-18
    Guang Chao Wang; Kenny Gross; Akshay Subramaniam

    Deploying big-data Machine Learning (ML) services in a cloud environment presents a challenge to the cloud vendor with respect to the cloud container configuration sizing for any given customer use case. OracleLabs has developed an automated framework that uses nested-loop Monte Carlo simulation to autonomously scale any size customer ML use cases across the range of cloud CPU-GPU "Shapes" (configurations

    Updated: 2020-03-19
  • Gradient Estimation for Federated Learning over Massive MIMO Communication Systems
    arXiv.cs.DC Pub Date : 2020-03-18
    Yo-Seb Jeon; Mohammad Mohammadi Amiri; Jun Li; H. Vincent Poor

    Federated learning is a communication-efficient and privacy-preserving solution to train a global model through the collaboration of multiple devices each with its own local training data set. In this paper, we consider federated learning over massive multiple-input multiple-output (MIMO) communication systems in which wireless devices train a global model with the aid of a central server equipped

    Updated: 2020-03-19
  • On the Analysis of Parallel Real-Time Tasks with Spin Locks
    arXiv.cs.DC Pub Date : 2020-03-18
    Xu Jiang; Nan Guan; He Du; Weichen Liu; Wang Yi

    A locking protocol is an essential component in the resource management of real-time systems, coordinating mutually exclusive accesses to shared resources from different tasks. Although the design and analysis of locking protocols have been intensively studied for sequential real-time tasks, there has been little work on this topic for parallel real-time tasks. In this paper, we study the analysis of

    Updated: 2020-03-19
  • From Sensor to Processing Networks: Optimal Estimation with Computation and Communication Latency
    arXiv.cs.DC Pub Date : 2020-03-16
    Luca Ballotta; Luca Schenato; Luca Carlone

    This paper investigates the use of a networked system (e.g., swarm of robots, smart grid, sensor network) to monitor a time-varying phenomenon of interest in the presence of communication and computation latency. Recent advances in edge computing have enabled processing to be spread across the network, hence we investigate the fundamental computation-communication trade-off, arising when a sensor

    Updated: 2020-03-19
  • Cross Architectural Power Modelling
    arXiv.cs.DC Pub Date : 2020-03-17
    Kai Chen; Peter Kilpatrick; Dimitrios S. Nikolopoulos; Blesson Varghese

    Existing power modelling research focuses on the model rather than the process for developing models. An automated power modelling process that can be deployed on different processors for developing power models with high accuracy is developed. For this, (i) an automated hardware performance counter selection method that selects counters best correlated to power on both ARM and Intel processors, (ii)

    Updated: 2020-03-19
  • Vermillion: A High-Performance Scalable IoT Middleware for Smart Cities
    arXiv.cs.DC Pub Date : 2020-03-14
    Poorna Chandra Tejasvi; Vasanth Rajaraman; Arun Babu Puthuparambil; Akhil Pankaj; Bharadwaj Amrutur

    With the massive increase in the number of IoT devices being deployed in smart cities, it becomes paramount for middlewares to be able to handle very high loads and support demanding use-cases. In order to do so, middlewares must scale horizontally while providing a commensurate increase in availability and throughput. Currently, most open-source IoT middlewares do not provide out-of-the-box support

    Updated: 2020-03-19
  • Dynamic Budget Management with Service Guarantees for Mixed-Criticality Systems
    arXiv.cs.DC Pub Date : 2020-03-11
    Xiaozhe Gu; Arvind Easwaran

    Many existing studies on mixed-criticality (MC) scheduling assume that low-criticality budgets for high-criticality applications are known a priori. These budgets are primarily used as guidance to determine when the scheduler should switch the system mode from low to high. Based on this key observation, in this paper we propose a dynamic MC scheduling model under which low-criticality budgets for individual

    Updated: 2020-03-19
  • When parallel speedups hit the memory wall
    arXiv.cs.DC Pub Date : 2019-05-03
    Alex F. A. Furtunato; Kyriakos Georgiou; Kerstin Eder; Samuel Xavier-de-Souza

    After Amdahl's trailblazing work, many other authors proposed analytical speedup models but none have considered the limiting effect of the memory wall. These models exploited aspects such as problem-size variation, memory size, communication overhead, and synchronization overhead, but data-access delays are assumed to be constant. Nevertheless, such delays can vary, for example, according to the number

    Updated: 2020-03-19
  • Optimal Energy Efficiency with Delay Constraints for Multi-layer Cooperative Fog Computing Networks
    arXiv.cs.DC Pub Date : 2019-06-09
    Thai T. Vu; Diep N. Nguyen; Dinh Thai Hoang; Eryk Dutkiewicz; Thuy V. Nguyen

    We develop a joint offloading and resource allocation framework for a multi-layer cooperative fog computing network, aiming to minimize the total energy consumption of multiple mobile devices subject to their service delay requirements. The resulting optimization involves both binary (offloading decisions) and real variables (resource allocations), making it an NP-hard and computationally intractable

    Updated: 2020-03-19
  • Decentralized Deep Learning with Arbitrary Communication Compression
    arXiv.cs.DC Pub Date : 2019-07-22
    Anastasia Koloskova; Tao Lin; Sebastian U. Stich; Martin Jaggi

    Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks, as well as for efficient scaling to large compute clusters. As current approaches suffer from limited bandwidth of the network, we propose the use of communication compression in the decentralized training context. We show that Choco-SGD, recently introduced and analyzed

    Updated: 2020-03-19
  • First Analysis of Local GD on Heterogeneous Data
    arXiv.cs.DC Pub Date : 2019-09-10
    Ahmed Khaled; Konstantin Mishchenko; Peter Richtárik

    We provide the first convergence analysis of local gradient descent for minimizing the average of smooth and convex but otherwise arbitrary functions. Problems of this form and local gradient descent as a solution method are of importance in federated learning, where each function is based on private data stored by a user on a mobile device, and the data of different users can be arbitrarily heterogeneous

    Updated: 2020-03-19
  • Gradient Descent with Compressed Iterates
    arXiv.cs.DC Pub Date : 2019-09-10
    Ahmed Khaled; Peter Richtárik

    We propose and analyze a new type of stochastic first order method: gradient descent with compressed iterates (GDCI). GDCI in each iteration first compresses the current iterate using a lossy randomized compression technique, and subsequently takes a gradient step. This method is a distillation of a key ingredient in the current practice of federated learning, where a model needs to be compressed by

    Updated: 2020-03-19
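
    The GDCI entry above describes an iteration that first compresses the current iterate with a lossy randomized compressor and then takes a gradient step. The sketch below is one plausible reading of that sentence, not the paper's exact update rule. The choice of compressor (unbiased random-k sparsification), the decision to evaluate the gradient at the compressed point, and all names (`rand_k`, `gdci`, `quad_grad`) are assumptions. The toy objective f(x) = 0.5 * ||x||^2 has its minimizer at the origin, where the compression noise vanishes, so it does not show the behavior expected for general objectives.

```python
import random


def rand_k(x, k, rng):
    """Unbiased random sparsifier: keep k random coordinates, scale them by d/k."""
    d = len(x)
    keep = set(rng.sample(range(d), k))
    return [xi * d / k if i in keep else 0.0 for i, xi in enumerate(x)]


def gdci(grad, x0, step, k, iters, seed=0):
    """Compress the iterate, then take a gradient step from the compressed point."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        y = rand_k(x, k, rng)   # lossy randomized compression of the iterate
        g = grad(y)             # gradient at the compressed point (assumption)
        x = [yi - step * gi for yi, gi in zip(y, g)]
    return x


def quad_grad(x):
    """Gradient of the toy objective f(x) = 0.5 * ||x||^2."""
    return list(x)


# Hypothetical toy run: 4 coordinates, 2 kept per iteration.
x_final = gdci(quad_grad, [1.0, -2.0, 3.0, 0.5], step=0.5, k=2, iters=200)
```

    With these settings each iteration multiplies the expected squared norm by (1 - step)^2 * d/k = 0.5, so the iterate shrinks toward the minimizer despite the compression.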
  • The OpenUAV Swarm Simulation Testbed: a Collaborative Design Studio for Field Robotics
    arXiv.cs.DC Pub Date : 2019-10-02
    Harish Anand; Zhiang Chen; Sarah Bearman; Prasad Antervedi; Devin Keating; Stephen A. Rees; Jnaneshwar Das

    In this paper, we describe our OpenUAV multi-robot design studio that enables simulations to run as browser accessible Lubuntu desktop containers. Our simulation testbed, based on ROS, Gazebo, and PX4 flight stack has been developed to facilitate collaborative mission planning, and serve as a sandbox for vision-based problems, collision avoidance, and multi-robot coordination for Unmanned Aircraft

    Updated: 2020-03-19
  • A Flexible n/2 Adversary Node Resistant and Halting Recoverable Blockchain Sharding Protocol
    arXiv.cs.DC Pub Date : 2020-03-16
    Yibin Xu; Yangyu Huang; Jianhua Shao; George Theodorakopoulos

    Blockchain sharding is a promising approach to solving the dilemma between decentralisation and high performance (transaction throughput) for blockchain. The main challenge of Blockchain sharding systems is how to reach a decision on a statement among a sub-group (shard) of people while ensuring the whole population recognises this statement. Namely, the challenge is to prevent an adversary who does

    Updated: 2020-03-19
  • Adapting Persistent Data Structures for Concurrency and Speculation
    arXiv.cs.DC Pub Date : 2020-03-16
    Thomas Dickerson

    This work unifies insights from the systems and functional programming communities, in order to enable compositional reasoning about software which is nonetheless efficiently realizable in hardware. It exploits a correspondence between design goals for efficient concurrent data structures and efficient immutable persistent data structures, to produce novel implementations of mutable concurrent trees

    Updated: 2020-03-18
  • Beyond Alice and Bob: Improved Inapproximability for Maximum Independent Set in CONGEST
    arXiv.cs.DC Pub Date : 2020-03-16
    Yuval Efron; Ofer Grossman; Seri Khoury

    By far the most fruitful technique for showing lower bounds for the CONGEST model is reductions to two-party communication complexity. This technique has yielded nearly tight results for various fundamental problems such as distance computations, minimum spanning tree, minimum vertex cover, and more. In this work, we take this technique a step further, and we introduce a framework of reductions to

    Updated: 2020-03-18
  • The Power of Global Knowledge on Self-stabilizing Population Protocols
    arXiv.cs.DC Pub Date : 2020-03-17
    Yuichi Sudo; Masahiro Shibata; Junya Nakamura; Yonghwan Kim; Toshimitsu Masuzawa

    In the population protocol model, many problems cannot be solved in a self-stabilizing way. However, global knowledge, such as the number of nodes in a network, sometimes allows us to design a self-stabilizing protocol for such problems. In this paper, we investigate the effect of global knowledge on the possibility of self-stabilizing population protocols in arbitrary graphs. Specifically, we clarify

    Updated: 2020-03-18
  • PigPaxos: Devouring the communication bottlenecks in distributed consensus
    arXiv.cs.DC Pub Date : 2020-03-17
    Aleksey Charapko; Ailidani Ailijiang; Murat Demirbas

    Protocols in the Paxos family are employed by many cloud computing services and distributed databases due to their excellent fault-tolerance properties. Unfortunately, current Paxos deployments do not scale beyond a dozen nodes due to the communication bottleneck at the leader. PigPaxos addresses this problem by decoupling the communication from the decision-making at the leader. To this end, PigPaxos

    Updated: 2020-03-18
  • Store-Collect in the Presence of Continuous Churn with Application to Snapshots and Lattice Agreement
    arXiv.cs.DC Pub Date : 2020-03-17
    Hagit Attiya; Sweta Kumari; Archit Somani; Jennifer L. Welch

    We present an algorithm for implementing a store-collect object in an asynchronous crash-prone message-passing dynamic system, where nodes continually enter and leave. The algorithm is very simple and efficient, requiring just one round trip for a store operation and two for a collect. We then show the versatility of the store-collect object for implementing churn-tolerant versions of useful data structures

    Updated: 2020-03-18
  • Machine Learning Pipelines with Modern Big Data Tools for High Energy Physics
    arXiv.cs.DC Pub Date : 2019-09-23
    Matteo Migliorini; Riccardo Castellotti; Luca Canali; Marco Zanetti

    The effective utilization at scale of complex machine learning (ML) techniques for HEP use cases poses several technological challenges, most importantly on the actual implementation of dedicated end-to-end data pipelines. A solution to these challenges is presented, which allows training neural network classifiers using solutions from the Big Data and data science ecosystems, integrated with tools

    Updated: 2020-03-18
  • A Fault-Tolerance Shim for Serverless Computing
    arXiv.cs.DC Pub Date : 2020-03-12
    Vikram Sreekanti; Chenggang Wu; Saurav Chhatrapati; Joseph E. Gonzalez; Joseph M. Hellerstein; Jose M. Faleiro

    Serverless computing has grown in popularity in recent years, with an increasing number of applications being built on Functions-as-a-Service (FaaS) platforms. By default, FaaS platforms support retry-based fault tolerance, but this is insufficient for programs that modify shared state, as they can unwittingly persist partial sets of updates in case of failures. To address this challenge, we would

    Updated: 2020-03-16
  • Characterizing Optimizations to Memory Access Patterns using Architecture-Independent Program Features
    arXiv.cs.DC Pub Date : 2020-03-12
    Aditya Chilukuri; Josh Milthorpe; Beau Johnston

    High-performance computing developers are faced with the challenge of optimizing the performance of OpenCL workloads on diverse architectures. The Architecture-Independent Workload Characterization (AIWC) tool is a plugin for the Oclgrind OpenCL simulator that gathers metrics of OpenCL programs that can be used to understand and predict program performance on an arbitrary given hardware architecture

    Updated: 2020-03-16
  • On Exploiting Transaction Concurrency To Speed Up Blockchains
    arXiv.cs.DC Pub Date : 2020-03-13
    Daniël Reijsbergen; Tien Tuan Anh Dinh

    Consensus protocols are currently the bottlenecks that prevent blockchain systems from scaling. However, we argue that transaction execution is also important to the performance and security of blockchains. In other words, there are ample opportunities to speed up and further secure blockchains by reducing the cost of transaction execution. Our goal is to understand how much we can speed up blockchains

    Updated: 2020-03-16
  • Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
    arXiv.cs.DC Pub Date : 2020-03-10
    Zhenheng Tang; Shaohuai Shi; Xiaowen Chu; Wei Wang; Bo Li

    Distributed deep learning has become a very common way to reduce the overall training time by exploiting multiple computing devices (e.g., GPUs/TPUs) as the size of deep models and data sets increases. However, data communication between computing devices can be a bottleneck that limits system scalability. How to address the communication problem in distributed deep learning is becoming a hot research

    Updated: 2020-03-16
  • An Analysis of Blockchain Consistency in Asynchronous Networks: Deriving a Neat Bound
    arXiv.cs.DC Pub Date : 2019-09-14
    Jun Zhao; Jing Tang; Li Zengxiang; Huaxiong Wang; Kwok-Yan Lam; Kaiping Xue

    Formal analyses of blockchain protocols have received much attention recently. Consistency results of Nakamoto's blockchain protocol are often expressed in a quantity $c$, which denotes the expected number of network delays before some block is mined. With $\mu$ (resp., $\nu$) denoting the fraction of computational power controlled by benign miners (resp., the adversary), where $\mu + \nu = 1$, we

    Updated: 2020-03-16
Contents have been reproduced by permission of the publishers.