• arXiv.cs.MA Pub Date : 2020-06-30
Lama Alfaseeh; Bilal Farooq

This study exploits the advancements in information and communication technology (ICT), connected and automated vehicles (CAVs), and sensing, to develop anticipatory multi-objective eco-routing strategies. For a robust application, several GHG costing approaches are examined. The predictive models for the link level traffic and emission states are developed using long short term memory deep network

Updated: 2020-07-01
• arXiv.cs.MA Pub Date : 2020-06-29
Juste Raimbault

An increased interdisciplinarity in science projects has been highlighted as crucial to tackle complex real-world challenges, but also as beneficial for the development of disciplines themselves. This paper introduces a parsimonious agent-based model of interdisciplinary relationships in collective enterprises of knowledge discovery, to investigate the impact of scientist-level decisions and preferences

Updated: 2020-07-01
• arXiv.cs.MA Pub Date : 2020-06-29
Shuyue Hu; Chin-Wing Leung; Ho-fung Leung; Harold Soh

Understanding the evolutionary dynamics of reinforcement learning under multi-agent settings has long remained an open problem. While previous works primarily focus on 2-player games, we consider population games, which model the strategic interactions of a large population comprising small and anonymous agents. This paper presents a formal relation between stochastic processes and the dynamics of

Updated: 2020-06-30
• arXiv.cs.MA Pub Date : 2020-06-29
Zahi M. Kakish; Karthik Elamvazhuthi; Spring Berman

In this paper, we present a reinforcement learning approach to designing a control policy for a "leader" agent that herds a swarm of "follower" agents, via repulsive interactions, as quickly as possible to a target probability distribution over a strongly connected graph. The leader control policy is a function of the swarm distribution, which evolves over time according to a mean-field model in

Updated: 2020-06-30
• arXiv.cs.MA Pub Date : 2020-06-26
Serge Plata; Sumanas Sarma; Melvin Lancelot; Kristine Bagrova; David Romano-Critchley

Taking the context of simulating a retail environment using agent based modelling, a theoretical model is presented that describes the probability distribution of customer "collisions" using a novel space transformation to the Torus $Tor^2$. A method for generating the distribution of customer paths based on historical basket data is developed. Finally a calculation of the number of simulations required

Updated: 2020-06-30
• arXiv.cs.MA Pub Date : 2020-06-25
Murat Cubuktepe; Zhe Xu; Ufuk Topcu

We study the distributed synthesis of policies for multi-agent systems to perform spatial-temporal tasks. We formalize the synthesis problem as a factored Markov decision process subject to graph temporal logic specifications. The transition function and task of each agent is a function of the agent itself and its neighboring agents. By leveraging the structure in the model, and the specifications

Updated: 2020-06-29
• arXiv.cs.MA Pub Date : 2020-06-26
Francesco Belardinelli; Catalin Dima; Vadim Malvone; Ferucio Tiplea

We show that a history-based variant of alternating bisimulation with imperfect information allows it to be related to a variant of Alternating-time Temporal Logic (ATL) with imperfect information by a full Hennessy-Milner theorem. The variant of ATL we consider has a common knowledge semantics, which requires that the uniform strategy available for a coalition to accomplish some goal must be common

Updated: 2020-06-29
• arXiv.cs.MA Pub Date : 2020-06-25
Nathan A. Brooks; Simon T. Powers; James M. Borg

Reducing the peak energy consumption of households is essential for the effective use of renewable energy sources, in order to ensure that as much household demand as possible can be met by renewable sources. This entails spreading out the use of high-powered appliances such as dishwashers and washing machines throughout the day. Traditional approaches to this problem have relied on differential pricing

Updated: 2020-06-26
• arXiv.cs.MA Pub Date : 2020-06-25
Frank Schweitzer; Yan Zhang; Giona Casiraghi

We investigate a multi-agent model of firms in an R&D network. Each firm is characterized by its knowledge stock $x_{i}(t)$, which follows non-linear dynamics. It can grow with the input from other firms, i.e., by knowledge transfer, and decays otherwise. Maintaining interactions is costly. Firms can leave the network if their expected knowledge growth is not realized, which may cause other firms

Updated: 2020-06-26
• arXiv.cs.MA Pub Date : 2020-06-25
Yifan Mao; Soubhik Deb; Shaileshh Bojja Venkatakrishnan; Sreeram Kannan; Kannan Srinivasan

A key performance metric in blockchains is the latency between when a transaction is broadcast and when it is confirmed (the so-called confirmation latency). While improvements in consensus techniques can lead to lower confirmation latency, a fundamental lower bound on confirmation latency is the propagation latency of messages through the underlying peer-to-peer (p2p) network (in Bitcoin, the propagation

Updated: 2020-06-26
• arXiv.cs.MA Pub Date : 2020-06-24
Andrea Angiuli; Jean-Pierre Fouque; Mathieu Laurière

We present a Reinforcement Learning (RL) algorithm to solve infinite horizon asymptotic Mean Field Game (MFG) and Mean Field Control (MFC) problems. Our approach can be described as a unified two-timescale Mean Field Q-learning: The same algorithm can learn either the MFG or the MFC solution by simply tuning a parameter. The algorithm is in discrete time and space where the agent not only provides

Updated: 2020-06-25
• arXiv.cs.MA Pub Date : 2020-06-24
Sofia M Nikolakaki; Ogheneovo Dibie; Ahmad Beirami; Nicholas Peterson; Navid Aghdaie; Kazi Zaman

Competition is a primary driver of player satisfaction and engagement in multiplayer online games. Traditional matchmaking systems aim at creating matches involving teams of similar aggregated individual skill levels, such as Elo score or TrueSkill. However, team dynamics cannot be solely captured using such linear predictors. Recently, it has been shown that nonlinear predictors that target to learn

Updated: 2020-06-25
• arXiv.cs.MA Pub Date : 2020-06-24
Virginia Bordignon; Vincenzo Matta; Ali H. Sayed

This work addresses the problem of sharing partial information within social learning strategies. In traditional social learning, agents solve a distributed multiple hypothesis testing problem by performing two operations at each instant: first, agents incorporate information from private observations to form their beliefs over a set of hypotheses; second, agents combine the entirety of their beliefs

Updated: 2020-06-25
• arXiv.cs.MA Pub Date : 2020-06-24
Ehsan Asali; Farzin Negahbani; Shahriyar Bamaei; Zahra Abbasi

In this article, we discuss methods and ideas implemented in the Namira 2D Soccer Simulation team over the past year. Numerous scientific and programming activities were carried out during code development, but we describe only the most notable ones in detail. A Kalman filtering method for localization and two helpful software packages are discussed here. Namira uses agent2d-3

Updated: 2020-06-25
• arXiv.cs.MA Pub Date : 2020-06-23
Ding Wang; Brian Yueshuai He; Jingqin Gao; Joseph Y. J. Chow; Kaan Ozbay; Shri Iyer

The COVID-19 pandemic has affected travel behaviors and transportation system operations, and cities are grappling with what policies can be effective for a phased reopening shaped by social distancing. A baseline model was previously developed and calibrated for pre-COVID conditions as MATSim-NYC. A new COVID model is calibrated that represents travel behavior during the COVID-19 pandemic by recalibrating

Updated: 2020-06-25
• arXiv.cs.MA Pub Date : 2020-06-12

Considering the players' bargaining power, designing a bi-level programming model is suitable to reflect the hierarchical nature of the decision-making process. In this paper, typical negotiation components perfectly match with the mathematical model and its solution procedure. For this purpose, a mathematical negotiation mechanism is designed to minimize the negotiators' costs in a distributed procurement

Updated: 2020-06-24
• arXiv.cs.MA Pub Date : 2020-06-23
Zijia Zhong; Mark Nejad; Earl E. Lee

Intersections are a major source of traffic delays and accidents within modern transportation systems. Compared to signalized intersection management, autonomous intersection management (AIM) coordinates the intersection crossing at an individual vehicle level with additional flexibility. AIM can potentially eliminate stopping in intersection crossing due to traffic lights while maintaining a safe separation

Updated: 2020-06-24
• arXiv.cs.MA Pub Date : 2020-06-16

With the advent of cloud computing, large organizations have been scaling their on-premise IT infrastructure to the cloud. Although this is a popular practice, there is a lack of comprehensive efforts to study the automated negotiation of resources among cloud customers and providers. This paper proposes a full-fledged framework for the multi-party, multi-issue negotiation system for

Updated: 2020-06-24
• arXiv.cs.MA Pub Date : 2020-06-23
Nelson Vadori; Sumitra Ganesh; Prashant Reddy; Manuela Veloso

Training multi-agent systems (MAS) to achieve realistic equilibria gives us a useful tool to understand and model real-world systems. We consider a general sum partially observable Markov game where agents of different types share a single policy network, conditioned on agent-specific information. This paper aims at i) formally understanding equilibria reached by such agents, and ii) matching emergent

Updated: 2020-06-24
• arXiv.cs.MA Pub Date : 2020-06-23
Yongxin Wang; Xinshuo Weng; Kris Kitani

Object detection and data association are critical components in multi-object tracking (MOT) systems. Despite the fact that these two components are highly dependent on each other, one popular trend in MOT is to perform detection and data association as separate modules, processed in a cascaded order. Due to this cascaded process, the resulting MOT system can only perform forward inference and cannot

Updated: 2020-06-24
• arXiv.cs.MA Pub Date : 2020-06-23
Haotian Liu; Wenchuan Wu

Distributed Volt/Var control (VVC) methods have been widely studied for active distribution networks (ADNs); they are based on a perfect model and real-time P2P communication. However, the model is always incomplete, with significant parameter errors, and such a P2P communication system is hard to maintain. In this paper, we propose an online multi-agent reinforcement learning and decentralized control

Updated: 2020-06-24
• arXiv.cs.MA Pub Date : 2020-06-21
Cleber Jorge Amaral; Stephen Cranefield; Jomi Fred Hübner; Mario Lucio Roloff

There are many challenges in building the smart factory, among them dealing with distributed data, high volumes of information, and a wide diversity of devices and applications. In this sense, the Cyber-Physical System (CPS) concept emerges to virtualize and integrate factory resources. Based on studies that use a Multi-Agent System as the core of a CPS, we show in this paper that many resources of the

Updated: 2020-06-23
• arXiv.cs.MA Pub Date : 2020-06-22
Shaocheng Luo; Jonghoek Kim; Byung-Cheol Min

Harmful marine spills, such as algae blooms and oil spills, damage ecosystems and threaten public health tremendously. Hence, an effective spill coverage and removal strategy will play a significant role in environmental protection. In recent years, low-cost water surface robots have emerged as a solution, with their efficacy verified at small scale. However, practical limitations such as connectivity

Updated: 2020-06-23
• arXiv.cs.MA Pub Date : 2020-06-22
Ariah Klages-Mundt; Dominik Harz; Lewis Gudgeon; Jun-You Liu; Andreea Minca

Stablecoins are one of the most widely capitalized types of cryptocurrency. However, their risks vary significantly according to their design and are often poorly understood. In this paper, we seek to provide a sound foundation for stablecoin theory, with a risk-based functional characterization of the economic structure of stablecoins. First, we match existing economic models to the disparate set of

Updated: 2020-06-23
• arXiv.cs.MA Pub Date : 2020-06-22
Aniq Ur Rahman; Gourab Ghatak; Antonio De Domenico

We consider the latency minimization problem in a task-offloading scenario, where multiple servers are available to the user equipment for outsourcing computational tasks. To account for the temporally dynamic nature of the wireless links and the availability of the computing resources, we model the server selection as a multi-armed bandit (MAB) problem. In the considered MAB framework, rewards are

Updated: 2020-06-23
• arXiv.cs.MA Pub Date : 2020-06-21
Santiago Cuervo; Marco Alzate

With artificial intelligence systems becoming ubiquitous in our society, their designers will soon have to consider their social dimension, as many of these systems will have to interact with one another to work efficiently. With this in mind, we propose a decentralized deep reinforcement learning algorithm for the design of cooperative multi-agent systems. The algorithm is based on the hypothesis that

Updated: 2020-06-23
• arXiv.cs.MA Pub Date : 2020-06-21
Kiyeob Lee; Desik Rengarajan; Dileep Kalathil; Srinivas Shakkottai

Mean Field Games (MFG) are those in which each agent assumes that the states of all others are drawn in an i.i.d. manner from a common belief distribution, and optimizes accordingly. The equilibrium concept here is a Mean Field Equilibrium (MFE), and algorithms for learning MFE in dynamic MFGs are unknown in general due to the non-stationary evolution of the belief distribution. Our focus is on an

Updated: 2020-06-23
• arXiv.cs.MA Pub Date : 2020-06-20

Combining the predictions of collections of neural networks often outperforms the best single network. Such ensembles are typically trained independently, and their superior 'wisdom of the crowd' originates from the differences between networks. Collective foraging and decision making in socially interacting animal groups is often improved or even optimal thanks to local information sharing between

Updated: 2020-06-23
• arXiv.cs.MA Pub Date : 2020-06-19
Sheng Li; Jayesh K. Gupta; Peter Morales; Ross Allen; Mykel J. Kochenderfer

Multi-agent reinforcement learning (MARL) requires coordination to efficiently solve certain tasks. Fully centralized control is often infeasible in such domains due to the size of joint action spaces. Coordination graph based formalization allows reasoning about the joint action based on the structure of interactions. However, they often require domain expertise in their design. This paper introduces

Updated: 2020-06-23
• arXiv.cs.MA Pub Date : 2020-06-16
Tarun Chitra; Alex Evans

As smart contract platforms autonomously manage billions of dollars of capital, quantifying the portfolio risk that investors engender in these systems is increasingly important. Recent work illustrates that Proof of Stake (PoS) is vulnerable to financial attacks arising from on-chain lending and has worse capital efficiency than Proof of Work (PoW) \cite{fanti_pos_econ}. Numerous methods for improving

Updated: 2020-06-22
• arXiv.cs.MA Pub Date : 2020-06-19
Antonis Bikakis; Patrice Caire

In multiagent systems, agents often have to rely on other agents to reach their goals, for example when they lack a needed resource or do not have the capability to perform a required action. Agents therefore need to cooperate. Then, some of the questions raised are: Which agent(s) to cooperate with? What are the potential coalitions in which agents can achieve their goals? As the number of possibilities

Updated: 2020-06-22
• arXiv.cs.MA Pub Date : 2020-06-18
Oscar de Lima; Hansal Shah; Ting-Sheng Chu; Brian Fogelson

With the advent of ride-sharing services, there is a huge increase in the number of people who rely on them for various needs. Most of the earlier approaches tackling this issue required handcrafted functions for estimating travel times and passenger waiting times. Traditional Reinforcement Learning (RL) based methods attempting to solve the ridesharing problem are unable to accurately model the complex

Updated: 2020-06-22
• arXiv.cs.MA Pub Date : 2020-06-18
Tabish Rashid; Gregory Farquhar; Bei Peng; Shimon Whiteson

QMIX is a popular $Q$-learning algorithm for cooperative MARL in the centralised training and decentralised execution paradigm. In order to enable easy decentralisation, QMIX restricts the joint action $Q$-values it can represent to be a monotonic mixing of each agent's utilities. However, this restriction prevents it from representing value functions in which an agent's ordering over its actions can

Updated: 2020-06-22
• arXiv.cs.MA Pub Date : 2020-06-18
Jesse Mulderij (Delft University of Technology); Bob Huisman (Nederlandse Spoorwegen); Denise Tönissen (Vrije Universiteit Amsterdam); Koos van der Linden (Delft University of Technology); Mathijs de Weerdt (Delft University of Technology)

In between transportation services, trains are parked and maintained at shunting yards. The conflict-free routing of trains to and on these yards and the scheduling of service and maintenance tasks is known as the train unit shunting and service problem. Efficient use of the capacity of these yards is becoming increasingly important, because of increasing numbers of trains without proportional extensions

Updated: 2020-06-19
• arXiv.cs.MA Pub Date : 2020-06-18
Denis J. S. de Albuquerque; Vanessa Tavares Nunes; Claudia Cappelli; Celia Ghedini Ralha

Transparency is an important factor in democratic societies composed of characteristics such as accessibility, usability, informativeness, understandability and auditability. In this research we focus on auditability since it plays an important role for citizens that need to understand and audit public information. Although auditability has been a subject of discussion when designing systems, there

Updated: 2020-06-19
• arXiv.cs.MA Pub Date : 2020-06-18
Yao Xiao; Mofeng Yang; Zheng Zhu; Hai Yang; Lei Zhang; Sepehr Ghader

Mathematical modeling of epidemic spreading has been widely adopted to estimate the threats of epidemic diseases (e.g., the COVID-19 pandemic) as well as to evaluate epidemic control interventions. Indoor places are considered a significant source of epidemic spreading risk, but existing widely used epidemic spreading models are usually limited for indoor places since the dynamic physical distance

Updated: 2020-06-19
• arXiv.cs.MA Pub Date : 2020-06-18
Manish Prajapat; Kamyar Azizzadenesheli; Alexander Liniger; Yisong Yue; Anima Anandkumar

A core challenge in policy optimization in competitive Markov decision processes is the design of efficient optimization methods with desirable convergence and stability properties. To tackle this, we propose competitive policy optimization (CoPO), a novel policy gradient approach that exploits the game-theoretic nature of competitive games to derive policy updates. Motivated by the competitive gradient

Updated: 2020-06-19
• arXiv.cs.MA Pub Date : 2020-06-18
Arrasy Rahman; Niklas Hopner; Filippos Christianos; Stefano V. Albrecht

Ad hoc teamwork is the challenging problem of designing an autonomous agent which can adapt quickly to collaborate with previously unknown teammates. Prior work in this area has focused on closed teams in which the number of agents is fixed. In this work, we consider open teams by allowing agents of varying types to enter and leave the team without prior notification. Our proposed solution builds on

Updated: 2020-06-19
• arXiv.cs.MA Pub Date : 2020-06-18
Aritra Guha; Rayleigh Lei; Jiacheng Zhu; XuanLong Nguyen; Ding Zhao

Robust representation learning of temporal dynamic interactions is an important problem in robotic learning in general and automated unsupervised learning in particular. Temporal dynamic interactions can be described by (multiple) geometric trajectories in a suitable space over which unsupervised learning techniques may be applied to extract useful features from raw and high-dimensional data measurements

Updated: 2020-06-19
• arXiv.cs.MA Pub Date : 2020-06-12
Sam Ganzfried

Successful algorithms have been developed for computing Nash equilibrium in a variety of finite game classes. However, solving continuous games---in which the pure strategy space is (potentially uncountably) infinite---is far more challenging. Nonetheless, many real-world domains have continuous action spaces, e.g., where actions refer to an amount of time, money, or other resource that is naturally

Updated: 2020-06-19
• arXiv.cs.MA Pub Date : 2020-06-17
Vaclav Uhlir; Frantisek Zboril; Frantisek Vidensky

During our participation in MAPC 2019, we developed two multi-agent systems designed specifically for this competition. The first is a pro-active system that works with pre-specified scenarios and tasks agents with generated goals designed for individual agents according to their assigned roles. The second system is designed to be more reactive and employs a layered architecture

Updated: 2020-06-18
• arXiv.cs.MA Pub Date : 2020-06-16
Georgios Papoudakis; Filippos Christianos; Stefano V. Albrecht

Modelling the behaviours of other agents (opponents) is essential for understanding how agents interact and making effective decisions. Existing methods for opponent modelling commonly assume knowledge of the local observations and chosen actions of the modelled opponents, which can significantly limit their applicability. We propose a new modelling technique based on variational autoencoders which

Updated: 2020-06-18
• arXiv.cs.MA Pub Date : 2020-06-16
Mohamed Sana; Antonio De Domenico; Wei Yu; Yves Lostanlen; Emilio Calvanese Strinati

Network densification and millimeter-wave technologies are key enablers to fulfill the capacity and data rate requirements of the fifth generation (5G) of mobile networks. In this context, designing low-complexity policies with local observations, yet able to adapt the user association with respect to the global network state and to the network dynamics is a challenge. In fact, the frameworks proposed

Updated: 2020-06-16
• arXiv.cs.MA Pub Date : 2020-06-16
Kyle Brown; Oriana Peltzer; Martin A. Sehr; Mac Schwager; Mykel J. Kochenderfer

We study the problem of sequential task assignment and collision-free routing for large teams of robots in applications with inter-task precedence constraints (e.g., task $A$ and task $B$ must both be completed before task $C$ may begin). Such problems commonly occur in assembly planning for robotic manufacturing applications, in which sub-assemblies must be completed before they can be combined to

Updated: 2020-06-16
• arXiv.cs.MA Pub Date : 2020-06-15
Michael J. Curry; Ping-Yeh Chiang; Tom Goldstein; John Dickerson

Optimal auctions maximize a seller's expected revenue subject to individual rationality and strategyproofness for the buyers. Myerson's seminal work in 1981 settled the case of auctioning a single item; however, subsequent decades of work have yielded little progress moving beyond a single item, leaving the design of revenue-maximizing auctions as a central open problem in the field of mechanism design

Updated: 2020-06-15
• arXiv.cs.MA Pub Date : 2020-06-15
Stephen McAleer; John Lanier; Roy Fox; Pierre Baldi

Finding approximate Nash equilibria in zero-sum imperfect-information games is challenging when the number of information states is large. Policy Space Response Oracles (PSRO) is a deep reinforcement learning algorithm grounded in game theory that is guaranteed to converge to an approximate Nash equilibrium. However, PSRO requires training a reinforcement learning policy at each iteration, making it

Updated: 2020-06-15
• arXiv.cs.MA Pub Date : 2020-06-15
Raphaël Berthier (PSL, SIERRA); Francis Bach (SIERRA, PSL); Pierre Gaillard (SIERRA, PSL)

In the context of statistical supervised learning, the noiseless linear model assumes that there exists a deterministic linear relation $Y = \langle \theta_*, X \rangle$ between the random output $Y$ and the random feature vector $\Phi(U)$, a potentially non-linear transformation of the inputs $U$. We analyze the convergence of single-pass, fixed step-size stochastic gradient descent on the least-square

Updated: 2020-06-15
• arXiv.cs.MA Pub Date : 2020-06-14
Esmaeil Seraj; Matthew Gombolay

Fighting wildfires is a precarious task, imperiling the lives of engaging firefighters and those who reside in the fire's path. Firefighters need online and dynamic observation of the firefront to anticipate a wildfire's unknown characteristics, such as size, scale, and propagation velocity, and to plan accordingly. In this paper, we propose a distributed control framework to coordinate a team of unmanned

Updated: 2020-06-14
• arXiv.cs.MA Pub Date : 2020-06-14
Georgios Papoudakis; Filippos Christianos; Lukas Schäfer; Stefano V. Albrecht

Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly-used evaluation tasks and criteria, making comparisons between approaches difficult. In this work, we evaluate and compare three different classes of MARL algorithms (independent learners, centralised training with decentralised execution, and value decomposition) in a diverse range of multi-agent learning tasks. Our results

Updated: 2020-06-14
• arXiv.cs.MA Pub Date : 2020-06-12
Filippos Christianos; Lukas Schäfer; Stefano V. Albrecht

Exploration in multi-agent reinforcement learning is a challenging problem, especially in environments with sparse rewards. We propose a general method for efficient exploration by sharing experience amongst agents. Our proposed algorithm, called Shared Experience Actor-Critic (SEAC), applies experience sharing in an actor-critic framework. We evaluate SEAC in a collection of sparse-reward multi-agent

Updated: 2020-06-12
• arXiv.cs.MA Pub Date : 2020-06-12
Neda Navidi; Francois Chabot; Sagar Kurandwad; Irv Lustigman; Vincent Robert; Gregory Szriftgiser; Andrea Schuch

Collaborative multi-agent reinforcement learning (MARL) as a specific category of reinforcement learning provides effective results with agents learning from their observations, received rewards, and internal interactions between agents. However, centralized learning methods with a joint global policy in a highly dynamic environment present unique challenges in dealing with large amounts of information

Updated: 2020-06-12
• arXiv.cs.MA Pub Date : 2020-06-12
Mohammad Rasouli; Tao Sun; Ram Rajagopal

We propose Federated Generative Adversarial Network (FedGAN) for training a GAN across distributed sources of non-independent-and-identically-distributed data subject to communication and privacy constraints. Our algorithm uses local generators and discriminators which are periodically synced via an intermediary that averages and broadcasts the generator and discriminator parameters. We theoretically

Updated: 2020-06-12
• arXiv.cs.MA Pub Date : 2020-06-12
Simon Vanneste; Astrid Vanneste; Siegfried Mercelis; Peter Hellinckx

This paper introduces a new approach for multi-agent communication learning called multi-agent counterfactual communication (MACC) learning. Many real-world problems are currently tackled using multi-agent techniques. However, in many of these tasks the agents do not observe the full state of the environment but only a limited observation. This absence of knowledge about the full state makes completing

Updated: 2020-06-12
• arXiv.cs.MA Pub Date : 2020-06-12
Mina Sedaghat; Pontus Sköldström; Daniell Turull; Vinay Yadhav; Joacim Halén; Madhubala Ganesan; Amardeep Mehta; Wolfgang John

Virtualization, either at OS- or hardware level, plays an important role in cloud computing. It enables easier automation and faster deployment in distributed environments. While virtualized infrastructures provide a level of management flexibility, they lack practical abstraction of the distributed resources. A developer in such an environment still needs to deal with all the complications of building

Updated: 2020-06-12
• arXiv.cs.MA Pub Date : 2020-06-11
Justin K Terry; Nathaniel Grammel

We introduce a new mathematical model of multi-agent reinforcement learning, the Multi-Agent Informational Learning Process or "MAILP" model. The model is based on the notion that agents have policies for a certain amount of information, and models how this information iteratively evolves and propagates through many agents. This model is very general, and the only meaningful assumption made is that learning

Updated: 2020-06-11
• arXiv.cs.MA Pub Date : 2020-06-11

We provide a brief description of the GOAL-DTU system for the agent contest, including the overall strategy and how the system is designed to apply this strategy. Our agents are implemented using the GOAL programming language. We evaluate the performance of our agents for the contest, and finally also discuss how to improve the system based on analysis of its strengths and weaknesses.

Updated: 2020-06-11
• arXiv.cs.MA Pub Date : 2020-06-11
Lukas Breitwieser; Ahmad Hesam; Jean de Montigny; Vasileios Vavourakis; Alexandros Iosif; Jack Jennings; Marcus Kaiser; Marco Manca; Alberto Di Meglio; Zaid Al-Ars; Fons Rademakers; Onur Mutlu; Roman Bauer

Computer simulation is an indispensable tool for studying complex biological systems. In particular, agent-based modeling is an attractive method to describe biophysical dynamics. However, two barriers limit faster progress. First, simulators do not always take full advantage of parallel and heterogeneous hardware. Second, many agent-based simulators are written with a specific research problem in

Updated: 2020-06-11
• arXiv.cs.MA Pub Date : 2020-06-11
Ziluo Ding; Tiejun Huang; Zongqing Lu

Communication lays the foundation for human cooperation. It is also crucial for multi-agent cooperation. However, existing work focuses on broadcast communication, which is not only impractical but also leads to information redundancy that could even impair the learning process. To tackle these difficulties, we propose Individually Inferred Communication (I2C), a simple yet effective model

Updated: 2020-06-11
• arXiv.cs.MA Pub Date : 2020-06-11
Guannan Qu; Yiheng Lin; Adam Wierman; Na Li

It has long been recognized that multi-agent reinforcement learning (MARL) faces significant scalability issues because the sizes of the state and action spaces are exponentially large in the number of agents. In this paper, we identify a rich class of networked MARL problems where the model exhibits a local dependence structure that allows it to be solved in a scalable manner. Specifically

Updated: 2020-06-11
• arXiv.cs.MA Pub Date : 2020-06-10
Jobst Heitzig; Forest W. Simmons

Are there voting methods which (i) give everyone, including minorities, an equal share of effective power even if voters act strategically, (ii) promote consensus rather than polarization and inequality, and (iii) do not favour the status quo or rely too much on chance? We show the answer is yes by describing two nondeterministic voting methods, one based on automatic bargaining over lotteries, the

Updated: 2020-06-10
Contents have been reproduced by permission of the publishers.
