Current journal: arXiv - CS - Artificial Intelligence
  • Algorithms in Multi-Agent Systems: A Holistic Perspective from Reinforcement Learning and Game Theory
    arXiv.cs.AI Pub Date : 2020-01-17
    Yunlong Lu; Kai Yan

    Deep reinforcement learning (RL) has achieved outstanding results in recent years, which has led to a dramatic increase in the number of methods and applications. Recent works explore learning beyond single-agent scenarios and consider multi-agent scenarios. However, they face many challenges and seek help from traditional game-theoretic algorithms, which, in turn, show great promise when combined with modern algorithms and boosted computing power. In this survey, we first introduce basic concepts and algorithms in single-agent RL and multi-agent systems; then, we summarize the related algorithms from three aspects. Solution concepts from game theory inspire algorithms that try to evaluate the agents or find better solutions in multi-agent systems. Fictitious self-play has become popular and has had a great impact on multi-agent reinforcement learning algorithms. Counterfactual regret minimization is an important tool for solving games with incomplete information, and has shown great strength when combined with deep learning.
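    The regret-matching update at the heart of counterfactual regret minimization can be sketched in a few lines. Below is a minimal, self-contained illustration on rock-paper-scissors (a single information set); CFR applies this same update at every information set of an extensive-form game. All names and the sampled-feedback variant are ours, not the survey's.

```python
import numpy as np

def regret_matching(payoff, iters=20000, seed=0):
    """Self-play regret matching on a symmetric zero-sum matrix game.

    Each iteration plays the current regret-matching strategy against
    itself and accumulates sampled regrets; the *average* strategy is
    what converges towards a Nash equilibrium.
    """
    rng = np.random.default_rng(seed)
    n = payoff.shape[0]
    regrets = np.zeros(n)
    strategy_sum = np.zeros(n)
    for _ in range(iters):
        # Current strategy: positive regrets normalised, else uniform.
        pos = np.maximum(regrets, 0.0)
        strategy = pos / pos.sum() if pos.sum() > 0 else np.full(n, 1.0 / n)
        strategy_sum += strategy
        opp_action = rng.choice(n, p=strategy)      # self-play opponent
        utilities = payoff[:, opp_action]           # payoff of each reply
        played = rng.choice(n, p=strategy)          # action actually taken
        regrets += utilities - utilities[played]    # sampled regret update
    return strategy_sum / strategy_sum.sum()

# Rock-paper-scissors payoff matrix for the row player.
rps = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)
avg_strategy = regret_matching(rps)
```

For rock-paper-scissors the equilibrium is the uniform mixture, so the averaged strategy should end up close to (1/3, 1/3, 1/3).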

    Updated: 2020-01-22
  • Activism by the AI Community: Analysing Recent Achievements and Future Prospects
    arXiv.cs.AI Pub Date : 2020-01-17
    Haydn Belfield

    The artificial intelligence (AI) community has recently engaged in activism in relation to its employers, other members of the community, and governments in order to shape the societal and ethical implications of AI. It has achieved some notable successes, but prospects for further political organising and activism are uncertain. We survey activism by the AI community over the last six years; apply two analytical frameworks drawing on the literature on epistemic communities and on worker organising and bargaining; and explore what they imply for the future prospects of the AI community. Success thus far has hinged on a coherent shared culture and high bargaining power due to the high demand for a limited supply of AI talent. Both are crucial to the future of AI activism and worthy of sustained attention.

    Updated: 2020-01-22
  • The Risk to Population Health Equity Posed by Automated Decision Systems: A Narrative Review
    arXiv.cs.AI Pub Date : 2020-01-18
    Mitchell Burger

    Artificial intelligence is already ubiquitous, and is increasingly being used to autonomously make ever more consequential decisions. However, there has been relatively little research into the equity consequences of using narrow AI and automated decision systems in medicine and public health. A narrative review using a hermeneutic approach was undertaken to explore current and future uses of AI in medicine and public health, issues that have emerged, and longer-term implications for population health. Accounts in the literature reveal a tremendous expectation that AI will transform medical and public health practices, especially regarding precision medicine and precision public health. Automated decisions being made about disease detection, diagnosis, treatment, and health funding allocation have significant consequences for individual and population health and wellbeing. Meanwhile, it is evident that issues of bias, incontestability, and erosion of privacy have emerged in sensitive domains where narrow AI and automated decision systems are in common use. As the use of automated decision systems expands, it is probable that these same issues will manifest widely in medicine and public health applications. Bias, incontestability, and erosion of privacy are mechanisms by which existing social, economic and health disparities are perpetuated and amplified. The implication is that there is a significant risk that the use of automated decision systems in health will exacerbate existing population health inequities. The industrial scale and rapidity with which automated decision systems can be applied to whole populations heightens the risk to population health equity. There is a need therefore to design and implement automated decision systems with care, monitor their impact over time, and develop capacities to respond to issues as they emerge.

    Updated: 2020-01-22
  • Multi-agent Motion Planning for Dense and Dynamic Environments via Deep Reinforcement Learning
    arXiv.cs.AI Pub Date : 2020-01-18
    Samaneh Hosseini Semnani; Hugh Liu; Michael Everett; Anton de Ruiter; Jonathan P. How

    This paper introduces a hybrid algorithm of deep reinforcement learning (RL) and force-based motion planning (FMP) to solve the distributed motion planning problem in dense and dynamic environments. Individually, the RL and FMP algorithms each have their own limitations: FMP is not able to produce time-optimal paths, and existing RL solutions are not able to produce collision-free paths in dense environments. We therefore first improve the performance of recent RL approaches by introducing a new reward function that not only eliminates the need for a preliminary supervised learning (SL) step but also decreases the chance of collision in crowded environments. This improves performance, but many failure cases remain. We then develop a hybrid approach that falls back on the simpler FMP method in stuck, simple, and high-risk cases, and continues to use RL in normal cases where FMP cannot produce an optimal path. We also extend the GA3C-CADRL algorithm to 3D environments. Simulation results show that the proposed algorithm outperforms both the deep RL and FMP algorithms, producing up to 50% more successful scenarios than deep RL and up to 75% less extra time to reach the goal than FMP.
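    The fallback logic of such a hybrid can be summarised in a few lines. This is a sketch under our own assumptions: the state fields, thresholds, and function names are hypothetical, not the paper's interface.

```python
from dataclasses import dataclass

@dataclass
class State:
    nearest_obstacle_distance: float   # metres to the closest agent/obstacle
    steps_without_progress: int        # proxy for being "stuck"

def choose_action(state, rl_policy, fmp_planner,
                  min_clearance=0.5, stuck_steps=10):
    """Use the conservative force-based planner in stuck or high-risk
    states; otherwise use the RL policy, which is closer to time-optimal."""
    high_risk = state.nearest_obstacle_distance < min_clearance
    stuck = state.steps_without_progress >= stuck_steps
    return fmp_planner(state) if (high_risk or stuck) else rl_policy(state)

# Toy policies standing in for the trained RL network and the FMP solver.
normal_case = choose_action(State(2.0, 0), lambda s: "rl", lambda s: "fmp")
crowded_case = choose_action(State(0.2, 0), lambda s: "rl", lambda s: "fmp")
```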

    Updated: 2020-01-22
  • Graph Ordering: Towards the Optimal by Learning
    arXiv.cs.AI Pub Date : 2020-01-18
    Kangfei Zhao; Yu Rong; Jeffrey Xu Yu; Junzhou Huang; Hao Zhang

    Graph representation learning has achieved remarkable success in many graph-based applications, such as node classification, link prediction, and community detection. These models are usually designed to preserve vertex information at different granularities and to reduce problems in discrete space to machine learning tasks in continuous space. However, despite this fruitful progress, some graph applications, such as graph compression and edge partition, are very hard to reduce to graph representation learning tasks. Moreover, these problems are closely related to reformulating a global layout for a specific graph, which is an important NP-hard combinatorial optimization problem: graph ordering. In this paper, we propose to attack the graph ordering problem behind such applications with a novel learning approach. In contrast to greedy algorithms based on predefined heuristics, we propose a neural network model, Deep Order Network (DON), to capture the hidden locality structure from partial vertex order sets. Supervised by sampled partial orders, DON is able to infer unseen combinations. Furthermore, to alleviate the combinatorial explosion in the training space of DON and to enable efficient partial vertex order sampling, we employ a reinforcement learning model, the Policy Network, to automatically adjust the partial order sampling probabilities during the training phase of DON. In this way, the Policy Network improves training efficiency and guides DON to evolve towards a more effective model automatically. Comprehensive experiments on both synthetic and real data validate that DON-RL consistently outperforms the current state-of-the-art heuristic algorithms. Two case studies on graph compression and edge partitioning demonstrate the potential of DON-RL in real applications.

    Updated: 2020-01-22
  • Learning to See Analogies: A Connectionist Exploration
    arXiv.cs.AI Pub Date : 2020-01-18
    Douglas S. Blank

    This dissertation explores the integration of learning and analogy-making through the development of a computer program, called Analogator, that learns to make analogies by example. By "seeing" many different analogy problems, along with possible solutions, Analogator gradually develops an ability to make new analogies. That is, it learns to make analogies by analogy. This approach stands in contrast to most existing research on analogy-making, which typically assumes the a priori existence of analogical mechanisms within a model. The present research extends standard connectionist methodologies by developing a specialized associative training procedure for a recurrent network architecture. The network is trained to divide input scenes (or situations) into appropriate figure and ground components. Seeing one scene in terms of a particular figure and ground provides the context for seeing another in an analogous fashion. After training, the model is able to make new analogies between novel situations. Analogator has much in common with lower-level perceptual models of categorization and recognition; it thus serves as a unifying framework encompassing both high-level analogical learning and low-level perception. This approach is compared and contrasted with other computational models of analogy-making. The model's training and generalization performance is examined, and limitations are discussed.

    Updated: 2020-01-22
  • How do Data Science Workers Collaborate? Roles, Workflows, and Tools
    arXiv.cs.AI Pub Date : 2020-01-18
    Amy X. Zhang; Michael Muller; Dakuo Wang

    Today, the prominence of data science within organizations has given rise to teams of data science workers collaborating on extracting insights from data, as opposed to individual data scientists working alone. However, we still lack a deep understanding of how data science workers collaborate in practice. In this work, we conducted an online survey with 183 participants who work in various aspects of data science. We focused on their reported interactions with each other (e.g., managers with engineers) and with different tools (e.g., Jupyter Notebook). We found that data science teams are extremely collaborative and work with a variety of stakeholders and tools during the six common steps of a data science workflow (e.g., clean data and train model). We also found that the collaborative practices workers employ, such as documentation, vary according to the kinds of tools they use. Based on these findings, we discuss design implications for supporting data science team collaborations and future research directions.

    Updated: 2020-01-22
  • Teaching Software Engineering for AI-Enabled Systems
    arXiv.cs.AI Pub Date : 2020-01-18
    Christian Kästner; Eunsuk Kang

    Software engineers have significant expertise to offer when building intelligent systems, drawing on decades of experience and methods for building systems that are scalable, responsive and robust, even when built on unreliable components. Systems with artificial-intelligence or machine-learning (ML) components raise new challenges and require careful engineering. We designed a new course to teach software-engineering skills to students with a background in ML. We specifically go beyond traditional ML courses that teach modeling techniques under artificial conditions and focus, in lecture and assignments, on realism with large and changing datasets, robust and evolvable infrastructure, and purposeful requirements engineering that considers ethics and fairness as well. We describe the course and our infrastructure and share experience and all material from teaching the course for the first time.

    Updated: 2020-01-22
  • Fair Transfer of Multiple Style Attributes in Text
    arXiv.cs.AI Pub Date : 2020-01-18
    Karan Dabas; Nishtha Madan; Vijay Arya; Sameep Mehta; Gautam Singh; Tanmoy Chakraborty

    To preserve anonymity and obfuscate their identity on online platforms, users may morph their text and portray themselves as a different gender or demographic. Similarly, a chatbot may need to customize its communication style to improve engagement with its audience. This manner of changing the style of written text has gained significant attention in recent years. Yet past research works largely cater to the transfer of single style attributes. The disadvantage of focusing on a single style alone is that this often results in target text where other existing style attributes behave unpredictably or are unfairly dominated by the new style. To counteract this behavior, it is desirable to have a style transfer mechanism that can transfer or control multiple styles simultaneously and fairly. Through such an approach, one could obtain obfuscated or rewritten text incorporating a desired degree of multiple soft styles such as female-quality, politeness, or formality. In this work, we demonstrate that the transfer of multiple styles cannot be achieved by sequentially performing multiple single-style transfers, because each single style-transfer step often reverses or dominates the style incorporated by a previous transfer step. We then propose a neural network architecture for fairly transferring multiple style attributes in a given text. We test our architecture on the Yelp dataset and demonstrate superior performance compared to existing single-style transfer steps performed in sequence.

    Updated: 2020-01-22
  • FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback
    arXiv.cs.AI Pub Date : 2020-01-19
    Baicen Xiao; Qifan Lu; Bhaskar Ramasubramanian; Andrew Clark; Linda Bushnell; Radha Poovendran

    Reinforcement learning has been successful in training autonomous agents to accomplish goals in complex environments. Although it has been applied in multiple settings, including robotics and computer games, human players often find it easier to obtain higher rewards in some environments than reinforcement learning algorithms. This is especially true of high-dimensional state spaces where the reward obtained by the agent is sparse or extremely delayed. In this paper, we seek to effectively integrate feedback signals supplied by a human operator with deep reinforcement learning algorithms in high-dimensional state spaces. We call this FRESH (Feedback-based REward SHaping). During training, a human operator is presented with trajectories from a replay buffer and provides feedback on states and actions in each trajectory. To generalize the feedback signals provided by the human operator to previously unseen states and actions at test time, we use a feedback neural network. We use an ensemble of neural networks with a shared network architecture to represent model uncertainty and the confidence of the neural network in its output. The output of the feedback neural network is converted to a shaping reward that augments the reward provided by the environment. We evaluate our approach on the Bowling and Skiing Atari games in the Arcade Learning Environment. Although human experts have been able to achieve high scores in these environments, state-of-the-art deep learning algorithms perform poorly. We observe that FRESH is able to achieve much higher scores than state-of-the-art deep learning algorithms in both environments. FRESH also achieves a 21.4% higher score than a human expert in Bowling and does as well as a human expert in Skiing.
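    The shaping step described above can be sketched as follows. The combination rule and the ensemble-disagreement confidence test are our own simplification of the idea (mean feedback as the shaping signal, spread as uncertainty); the paper's exact scheme may differ.

```python
import numpy as np

def shaped_reward(env_reward, ensemble_outputs, weight=1.0, conf_threshold=0.5):
    """Augment the environment reward with a human-feedback shaping term.

    `ensemble_outputs` holds per-network feedback predictions in [-1, 1].
    The ensemble mean is used as the shaping signal; the spread across
    networks stands in for confidence, so low-agreement feedback is ignored.
    """
    preds = np.asarray(ensemble_outputs, dtype=float)
    mean, spread = preds.mean(), preds.std()
    confident = spread < conf_threshold
    return env_reward + weight * mean if confident else env_reward

r_confident = shaped_reward(0.0, [0.8, 0.9, 0.85])   # networks agree: shape
r_uncertain = shaped_reward(0.0, [0.9, -0.9, 0.1])   # networks disagree: skip
```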

    Updated: 2020-01-22
  • Correcting Knowledge Base Assertions
    arXiv.cs.AI Pub Date : 2020-01-19
    Jiaoyan Chen; Xi Chen; Ian Horrocks; Ernesto Jimenez-Ruiz; Erik B. Myklebus

    The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB.
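    The lexical-matching stage of such a correction framework can be illustrated with standard string similarity. This sketch covers only candidate generation (the framework also uses semantic embedding, soft constraint mining, and consistency checking); the function and entity names are illustrative.

```python
import difflib

def lexical_candidates(bad_entity, kb_entities, n=3, cutoff=0.6):
    """Propose KB entities that are string-similar to the erroneous
    assertion object, as replacement candidates for later semantic
    filtering. Only the first stage of the correction pipeline."""
    return difflib.get_close_matches(bad_entity, kb_entities, n=n, cutoff=cutoff)

candidates = lexical_candidates("Pariss", ["Paris", "Parma", "Berlin"])
```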

    Updated: 2020-01-22
  • A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions
    arXiv.cs.AI Pub Date : 2020-01-19
    Amit Kumar Mondal; Nadeem Jamali

    Reinforcement learning is one of the core components in designing an artificial intelligence system emphasizing real-time response. Reinforcement learning drives the system to take actions within an arbitrary environment, with or without prior knowledge of the environment model. In this paper, we present a comprehensive study of reinforcement learning focusing on various dimensions, including challenges, recent developments in state-of-the-art techniques, and future directions. The fundamental objective of this paper is to provide a framework for presenting available reinforcement learning methods that is informative enough and simple to follow for new researchers and academics in this domain, considering the latest concerns. First, we illustrate the core techniques of reinforcement learning in an easily understandable and comparable way. Finally, we analyze and depict recent developments in reinforcement learning approaches. Our analysis points out that most models focus on tuning policy values rather than other aspects of a particular state of reasoning.

    Updated: 2020-01-22
  • SQuINTing at VQA Models: Interrogating VQA Models with Sub-Questions
    arXiv.cs.AI Pub Date : 2020-01-20
    Ramprasaath R. Selvaraju; Purva Tendulkar; Devi Parikh; Eric Horvitz; Marco Ribeiro; Besmira Nushi; Ece Kamar

    Existing VQA datasets contain questions with varying levels of complexity. While the majority of questions in these datasets require perception for recognizing the existence, properties, and spatial relationships of entities, a significant portion pose challenges that correspond to reasoning tasks -- tasks that can only be answered through a synthesis of perception and knowledge about the world, logic, and/or reasoning. This distinction allows us to notice when existing VQA models have consistency issues: they answer the reasoning question correctly but fail on associated low-level perception questions. For example, models answer the complex reasoning question "Is the banana ripe enough to eat?" correctly, but fail on the associated perception question "Are the bananas mostly green or yellow?", indicating that the model likely answered the reasoning question correctly but for the wrong reason. We quantify the extent to which this phenomenon occurs by creating a new Reasoning split of the VQA dataset and collecting Sub-VQA, a new dataset consisting of 200K new perception questions which serve as sub-questions corresponding to the set of perceptual tasks needed to effectively answer the complex reasoning questions in the Reasoning split. Additionally, we propose an approach called Sub-Question Importance-aware Network Tuning (SQuINT), which encourages the model to attend to the same parts of the image when answering the reasoning question and the perception sub-questions. We show that SQuINT improves model consistency by 7.8%, while also marginally improving performance on the Reasoning questions in VQA and displaying qualitatively better attention maps.
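    The consistency issue described above suggests a simple metric: among reasoning questions a model answers correctly, how many also have all of their perception sub-questions answered correctly? The sketch below is our illustrative formulation, not necessarily the paper's exact measure.

```python
def consistency(reasoning_correct, sub_correct):
    """Fraction of correctly answered reasoning questions whose associated
    perception sub-questions are also all answered correctly.

    reasoning_correct: dict mapping question id -> bool
    sub_correct: dict mapping question id -> list of bools, one per
        sub-question (a question with no recorded sub-questions counts
        as consistent, since all([]) is True).
    """
    hits = [q for q, ok in reasoning_correct.items() if ok]
    if not hits:
        return 0.0
    consistent = sum(all(sub_correct.get(q, [])) for q in hits)
    return consistent / len(hits)

# q1 is consistent; q2 got the reasoning question right for the wrong reason.
score = consistency({"q1": True, "q2": True, "q3": False},
                    {"q1": [True, True], "q2": [True, False], "q3": [True]})
```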

    Updated: 2020-01-22
  • MOEA/D with Random Partial Update Strategy
    arXiv.cs.AI Pub Date : 2020-01-20
    Yuri Lavinas; Claus Aranha; Marcelo Ladeira; Felipe Campelo

    Recent studies on resource allocation suggest that some subproblems are more important than others in the context of the MOEA/D, and that focusing on the most relevant ones can consistently improve the performance of that algorithm. These studies share the common characteristic of updating only a fraction of the population at any given iteration of the algorithm. In this work we investigate a new, simpler partial update strategy, in which a random subset of solutions is selected at every iteration. The performance of the MOEA/D using this new resource allocation approach is compared experimentally against that of the standard MOEA/D-DE and the MOEA/D with relative improvement-based resource allocation. The results indicate that using the MOEA/D with this new partial update strategy results in improved HV and IGD values, and a much higher proportion of non-dominated solutions, particularly as the number of updated solutions at every iteration is reduced.
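    The random partial update strategy itself is simple to state in code: at every iteration, only a random subset of subproblems is selected for variation and update. The parameter names below are ours.

```python
import random

def partial_update_indices(pop_size, update_fraction, rng=None):
    """Select the random subset of subproblem indices to update this
    iteration (sketch of the random partial update strategy; the full
    MOEA/D machinery around it is omitted)."""
    rng = rng or random.Random(0)
    k = max(1, round(pop_size * update_fraction))
    return rng.sample(range(pop_size), k)

# Update 20% of a population of 100 subproblems this iteration.
chosen = partial_update_indices(100, 0.2)
```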

    Updated: 2020-01-22
  • A point-wise linear model reveals reasons for 30-day readmission of heart failure patients
    arXiv.cs.AI Pub Date : 2020-01-20
    Yasuho Yamashita; Takuma Shibahara; Junichi Kuwata

    Heart failure in the United States costs an estimated 30.7 billion dollars annually, and predictive analysis can decrease costs due to readmission of heart failure patients. Deep learning can predict readmissions but does not give reasons for its predictions. Ours is the first study on a deep-learning approach to explaining the decisions behind readmission predictions. Additionally, it provides an automatic patient stratification to explain cohorts of readmitted patients. The new deep-learning model, called a point-wise linear model, is a meta-learning machine of linear models: it generates a logistic regression model to predict early readmission for each patient. These custom-made prediction models allow us to analyze feature importance. We evaluated the approach using a dataset of 30-day readmission patients with heart failure. This study has been submitted to PLOS ONE; in advance, we would like to share the theoretical aspects of the point-wise linear model as part of our study.
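    The core idea of a point-wise linear model is that a meta-model outputs per-patient logistic-regression coefficients, which are then applied to that same patient's features. It can be sketched in miniature as below; here the meta-model is a fixed linear map for illustration (the paper uses a deep network), and all names are ours.

```python
import numpy as np

def pointwise_linear_predict(x, meta_weights, meta_bias):
    """Score one patient with patient-specific logistic-regression
    coefficients produced by a (here linear) meta-model. The returned
    coefficient vector is what makes the prediction explainable."""
    w = meta_weights @ x + meta_bias       # per-patient coefficients
    logit = float(w @ x)
    prob = 1.0 / (1.0 + np.exp(-logit))
    return prob, w

x = np.array([1.0, 0.5])                   # one patient's features
meta_W = np.zeros((2, 2))                  # meta-model reduced to a bias term
meta_b = np.array([2.0, -1.0])
prob, coeffs = pointwise_linear_predict(x, meta_W, meta_b)
```

Inspecting `coeffs` per patient is what enables feature-importance analysis and stratification of readmitted cohorts.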

    Updated: 2020-01-22
  • Measuring Diversity of Artificial Intelligence Conferences
    arXiv.cs.AI Pub Date : 2020-01-20
    Ana Freire; Lorenzo Porcaro; Emilia Gómez

    The lack of diversity of the Artificial Intelligence (AI) field is nowadays a concern, and several initiatives such as funding schemes and mentoring programs have been designed to fight against it. However, there is no indication on how these initiatives actually impact AI diversity in the short and long term. This work studies the concept of diversity in this particular context and proposes a small set of diversity indicators (i.e. indexes) of AI scientific events. These indicators are designed to quantify the lack of diversity of the AI field and monitor its evolution. We consider diversity in terms of gender, geographical location and business (understood as the presence of academia versus industry). We compute these indicators for the different communities of a conference: authors, keynote speakers and organizing committee. From these components we compute a summarized diversity indicator for each AI event. We evaluate the proposed indexes for a set of recent major AI conferences and we discuss their values and limitations.
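    One common way to quantify such diversity is a normalised entropy over category counts; the indicator below is an illustrative index in this spirit, not necessarily the paper's exact definition.

```python
import math
from collections import Counter

def diversity_index(members):
    """Normalised Shannon entropy of category counts, in [0, 1]:
    0 when a single category dominates completely, 1 for a perfectly
    even mix. Applicable to e.g. gender, geography, or academia-vs-
    industry labels of a conference community."""
    counts = Counter(members)
    n = sum(counts.values())
    if len(counts) < 2:
        return 0.0
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(len(counts))

even = diversity_index(["academia", "industry"] * 10)
skewed = diversity_index(["academia"] * 20)
```

Per-community indices (authors, keynote speakers, organizing committee) could then be averaged into a summary indicator for an event.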

    Updated: 2020-01-22
  • Synergizing Domain Expertise with Self-Awareness in Software Systems: A Patternized Architecture Guideline
    arXiv.cs.AI Pub Date : 2020-01-20
    Tao Chen; Rami Bahsoon; Xin Yao

    Architectural patterns provide reusable architectural solutions for commonly recurring problems in designing software systems. In this regard, self-awareness architectural patterns are specialized patterns that leverage good engineering practices and experience to help design self-awareness and self-adaptation into a software system. However, the domain knowledge and engineers' expertise built over time are not explicitly linked to these patterns and the self-aware process. This linkage is important, as it can enrich the design patterns of these systems, which consequently leads to more effective and efficient self-aware and self-adaptive behaviours. This paper is an introductory work that highlights the importance of synergizing domain expertise with self-awareness in software systems, relying on well-defined underlying approaches. In particular, we present a holistic framework that classifies widely known representations used to obtain and maintain domain expertise, documenting their nature and the specific rules that permit different levels of synergy with self-awareness. Drawing on this, we describe mechanisms that can enrich existing patterns with engineers' expertise and knowledge of the domain. This, together with the framework, allows us to codify an intuitive step-by-step methodology that guides engineers in making design decisions when synergizing domain expertise with self-awareness, and reveals their importance, in an attempt to keep 'engineers-in-the-loop'. Through three case studies, we demonstrate how the enriched patterns, the proposed framework, and the methodology can be applied in different domains, within which we quantitatively compare the actual benefits of incorporating engineers' expertise into self-awareness at alternative levels of synergy.

    Updated: 2020-01-22
  • The Incentives that Shape Behaviour
    arXiv.cs.AI Pub Date : 2020-01-20
    Ryan Carey; Eric Langlois; Tom Everitt; Shane Legg

    Which variables does an agent have an incentive to control with its decision, and which variables does it have an incentive to respond to? We formalise these incentives and demonstrate unique graphical criteria for detecting them in any single-decision causal influence diagram. To this end, we introduce structural causal influence models, a hybrid of the influence diagram and structural causal model frameworks. Finally, we illustrate how these criteria predict agent incentives in both fairness and AI safety applications.

    Updated: 2020-01-22
  • An interpretable neural network model through piecewise linear approximation
    arXiv.cs.AI Pub Date : 2020-01-20
    Mengzhuo Guo; Qingpeng Zhang; Xiuwu Liao; Daniel Dajun Zeng

    Most existing interpretable methods explain a black-box model in a post-hoc manner, which uses simpler models or data analysis techniques to interpret the predictions after the model is learned. However, they (a) may derive contradictory explanations on the same predictions given different methods and data samples, and (b) focus on using simpler models to provide higher descriptive accuracy at the sacrifice of prediction accuracy. To address these issues, we propose a hybrid interpretable model that combines a piecewise linear component and a nonlinear component. The first component describes the explicit feature contributions by piecewise linear approximation to increase the expressiveness of the model. The other component uses a multi-layer perceptron to capture feature interactions and implicit nonlinearity, and increase the prediction performance. Different from the post-hoc approaches, the interpretability is obtained once the model is learned in the form of feature shapes. We also provide a variant to explore higher-order interactions among features to demonstrate that the proposed model is flexible for adaptation. Experiments demonstrate that the proposed model can achieve good interpretability by describing feature shapes while maintaining state-of-the-art accuracy.
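    The additive structure of such a hybrid can be sketched directly: interpretable per-feature shape functions (piecewise linear, hence plottable) plus an opaque nonlinear correction term. The MLP is stubbed out here and all names are illustrative.

```python
import numpy as np

def piecewise_shape(x, knots, values):
    """Feature shape function: linear interpolation between knots.
    This is the readable, plottable part of the hybrid model."""
    return float(np.interp(x, knots, values))

def hybrid_predict(features, shapes, nonlinear=lambda f: 0.0):
    """Prediction = sum of interpretable per-feature shapes plus an
    opaque nonlinear correction (an MLP in the paper; a stub here)."""
    linear_part = sum(piecewise_shape(x, *shapes[i])
                      for i, x in enumerate(features))
    return linear_part + nonlinear(features)

shapes = [
    ([0.0, 1.0], [0.0, 2.0]),   # feature 0: slope-2 ramp
    ([0.0, 1.0], [1.0, 1.0]),   # feature 1: constant contribution of 1
]
y = hybrid_predict([0.5, 0.3], shapes)
```

Because the shape functions are learned with the model rather than fitted post hoc, their explanations cannot contradict the predictions they describe.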

    Updated: 2020-01-22
  • Dynamic Epistemic Logic Games with Epistemic Temporal Goals
    arXiv.cs.AI Pub Date : 2020-01-20
    Bastien Maubert; Aniello Murano; Sophie Pinchinat; François Schwarzentruber; Silvia Stranieri

    Dynamic Epistemic Logic (DEL) is a logical framework in which one can describe in great detail how actions are perceived by the agents, and how they affect the world. DEL games were recently introduced as a way to define classes of games with imperfect information where the actions available to the players are described very precisely. This framework makes it possible to define easily, for instance, classes of games where players can only use public actions or public announcements. These games have been studied for reachability objectives, where the aim is to reach a situation satisfying some epistemic property expressed in epistemic logic; several (un)decidability results have been established. In this work we show that the decidability results obtained for reachability objectives extend to a much more general class of winning conditions, namely those expressible in the epistemic temporal logic LTLK. To do so we establish that the infinite game structures generated by DEL public actions are regular, and we describe how to obtain finite representations on which we rely to solve them.

    Updated: 2020-01-22
  • Towards Social Identity in Socio-Cognitive Agents
    arXiv.cs.AI Pub Date : 2020-01-20
    Diogo Rato; Samuel Mascarenhas; Rui Prada

    Current architectures for social agents are designed around specific units of social behaviour that address particular challenges. Although their performance might be adequate for controlled environments, deploying these agents in the wild is difficult. Moreover, the increasing demand for autonomous agents capable of living alongside humans calls for the design of more robust social agents that can cope with diverse social situations. We believe that to design such agents, their sociality and cognition should be conceived as one. This includes creating mechanisms for constructing social reality as an interpretation of the physical world with social meanings, and for selectively deploying cognitive resources adequate to the situation. We identify several design principles that should be considered while designing agent architectures for socio-cognitive systems. Taking these remarks into account, we propose a socio-cognitive agent model based on the concept of Cognitive Social Frames, which allows an agent's cognition to adapt based on its interpretation of its surroundings, its Social Context. Our approach supports an agent's reasoning about other social actors and its relationship with them. Cognitive Social Frames can be built around social groups, and form the basis for social group dynamics mechanisms and the construction of Social Identity.

    Updated: 2020-01-22
  • AutoMATES: Automated Model Assembly from Text, Equations, and Software
    arXiv.cs.AI Pub Date : 2020-01-21
    Adarsh Pyarelal; Marco A. Valenzuela-Escarcega; Rebecca Sharp; Paul D. Hein; Jon Stephens; Pratik Bhandari; HeuiChan Lim; Saumya Debray; Clayton T. Morrison

    Models of complicated systems can be represented in different ways - in scientific papers, they are represented using natural language text as well as equations. But to be of real use, they must also be implemented as software, thus making code a third form of representing models. We introduce the AutoMATES project, which aims to build semantically-rich unified representations of models from scientific code and publications to facilitate the integration of computational models from different domains and allow for modeling large, complicated systems that span multiple domains and levels of abstraction.

    Updated: 2020-01-22
  • Sampling and Learning for Boolean Function
    arXiv.cs.AI Pub Date : 2020-01-21
    Chuyu Xiong

    In this article, we continue our study of universal learning machines by introducing new tools. We first discuss Boolean functions and Boolean circuits, and establish one set of tools, namely fitting extremum and proper sampling sets. We prove the fundamental relationship between proper sampling sets and the complexity of a Boolean circuit. Armed with this set of tools, we then introduce much more effective learning strategies. We show that with such learning strategies and learning dynamics, universal learning can be achieved with much less data.

    Updated: 2020-01-22
  • Lyceum: An efficient and scalable ecosystem for robot learning
    arXiv.cs.AI Pub Date : 2020-01-21
    Colin Summers; Kendall Lowrey; Aravind Rajeswaran; Siddhartha Srinivasa; Emanuel Todorov

    We introduce Lyceum, a high-performance computational ecosystem for robot learning. Lyceum is built on top of the Julia programming language and the MuJoCo physics simulator, combining the ease-of-use of a high-level programming language with the performance of native C. In addition, Lyceum has a straightforward API to support parallel computation across multiple cores and machines. Overall, depending on the complexity of the environment, Lyceum is 5-30x faster compared to other popular abstractions like OpenAI's Gym and DeepMind's dm-control. This substantially reduces training time for various reinforcement learning algorithms; and is also fast enough to support real-time model predictive control through MuJoCo. The code, tutorials, and demonstration videos can be found at: www.lyceum.ml.

  • On Algorithmic Decision Procedures in Emergency Response Systems in Smart and Connected Communities
    arXiv.cs.AI Pub Date : 2020-01-21
    Geoffrey Pettet; Ayan Mukhopadhyay; Mykel Kochenderfer; Yevgeniy Vorobeychik; Abhishek Dubey

    Emergency Response Management (ERM) is a critical problem faced by communities across the globe. Despite its importance, it is common for ERM systems to follow myopic and straightforward decision policies in the real world. Principled approaches to aid decision-making under uncertainty have been explored in this context but have failed to be accepted into real systems. We identify a key issue impeding their adoption - algorithmic approaches to emergency response focus on reactive, post-incident dispatching actions, i.e. optimally dispatching a responder after incidents occur. However, the critical nature of emergency response dictates that when an incident occurs, first responders always dispatch the closest available responder to the incident. We argue that the crucial period of planning for ERM systems is not post-incident, but between incidents. This is not a trivial planning problem, however - a major challenge is the complexity of dynamically balancing the spatial distribution of responders. An orthogonal problem in ERM systems is planning under limited communication, which is particularly important in disaster scenarios that affect communication networks. We address both problems by proposing two partially decentralized multi-agent planning algorithms that utilize heuristics and the structure of the dispatch problem. We evaluate our proposed approach using real-world data, and find that in several contexts, dynamically re-balancing the spatial distribution of emergency responders reduces both the average response time and its variance.
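    The fixed dispatch rule described above can be stated in a few lines. The sketch below (with a data layout of our own, not the paper's system) implements only the myopic closest-responder policy; the paper's contribution is the inter-incident re-balancing that this rule leaves out.

```python
import math

def dispatch_closest(responders, incident):
    # Myopic policy: always send the closest available responder.
    available = [r for r in responders if r["free"]]
    if not available:
        return None
    best = min(available, key=lambda r: math.dist(r["pos"], incident))
    best["free"] = False  # responder is now busy
    return best["id"]

responders = [
    {"id": "R1", "pos": (0.0, 0.0), "free": True},
    {"id": "R2", "pos": (5.0, 5.0), "free": True},
]
sent = dispatch_closest(responders, (4.0, 4.0))
print(sent)  # R2 (distance sqrt(2) vs sqrt(32))
```

    Between incidents, the paper's algorithms would move the remaining free responders to restore spatial coverage; that planning step is the hard part this sketch omits.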

  • A multi-agent ontologies-based clinical decision support system
    arXiv.cs.AI Pub Date : 2020-01-21
    Ying Shen (UPN); Armelle Jacquet-Andrieu (IDEES); Joël Colloc (IDEES)

    Clinical decision support systems combine knowledge and data from a variety of sources, represented either by quantitative models based on stochastic methods, or by qualitative models based rather on expert heuristics and deductive reasoning. At the same time, case-based reasoning (CBR) memorizes and returns the experience of solving similar problems. The cooperation of heterogeneous clinical knowledge bases (knowledge objects, semantic distances, evaluation functions, logical rules, databases...) is based on medical ontologies. A multi-agent decision support system (MADSS) enables the integration and cooperation of agents specialized in different fields of knowledge (semiology, pharmacology, clinical cases, etc.). Each specialist agent operates on a knowledge base defining the conduct to be followed, in conformity with the state of the art, associated with an ontological basis that expresses the semantic relationships between the terms of the domain in question. Our approach is based on the specialization of agents adapted to the knowledge models used during the clinical steps and ontologies. This modular approach is suitable for the realization of MADSS in many areas.

  • Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach
    arXiv.cs.AI Pub Date : 2020-01-21
    Carlos Fernandez; Foster Provost; Xintian Han

    Lack of understanding of the decisions made by model-based AI systems is an important barrier for their adoption. We examine counterfactual explanations as an alternative for explaining AI decisions. The counterfactual approach defines an explanation as a set of the system's data inputs that causally drives the decision (meaning that removing them changes the decision) and is irreducible (meaning that removing any subset of the inputs in the explanation does not change the decision). We generalize previous work on counterfactual explanations, resulting in a framework that (a) is model-agnostic, (b) can address features with arbitrary data types, (c) is able to explain decisions made by complex AI systems that incorporate multiple models, and (d) is scalable to large numbers of features. We also propose a heuristic procedure to find the most useful explanations depending on the context. We contrast counterfactual explanations with another alternative: methods that explain model predictions by weighting features according to their importance (e.g., SHAP, LIME). This paper presents two fundamental reasons why explaining model predictions is not the same as explaining the decisions made using those predictions, suggesting we should carefully consider whether importance-weight explanations are well-suited to explain decisions made by AI systems. Specifically, we show that (1) features that have a large importance weight for a model prediction may not actually affect the corresponding decision, and (2) importance weights are insufficient to communicate whether and how features influence system decisions. We demonstrate this using several examples, including three detailed studies using real-world data that compare the counterfactual approach with SHAP and illustrate various conditions under which counterfactual explanations explain data-driven decisions better than feature importance weights.
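    A minimal sketch of the irreducibility requirement: starting from a set of inputs whose removal flips the decision, greedily discard any input not needed for the flip. The removal convention (setting a feature to 0) and the toy model are assumptions of ours; the paper's heuristic search procedure is more general.

```python
def counterfactual_explanation(predict, x, removed_value=0):
    # Greedy search for an irreducible set of feature indices whose
    # "removal" (here: setting to removed_value) changes the decision.
    base = predict(x)

    def flips(subset):
        x2 = list(x)
        for i in subset:
            x2[i] = removed_value
        return predict(x2) != base

    explanation = set(range(len(x)))
    if not flips(explanation):
        return None  # removing everything never changes the decision
    changed = True
    while changed:
        changed = False
        for i in sorted(explanation):
            if flips(explanation - {i}):   # feature i not needed for the flip
                explanation = explanation - {i}
                changed = True
    return sorted(explanation)

# hypothetical toy model: approve (1) iff the feature sum exceeds 5
model = lambda x: int(sum(x) > 5)
expl = counterfactual_explanation(model, [4, 3, 1])
print(expl)  # → [1]
```

    Removing feature 1 (value 3) alone drops the sum to 5 and flips the decision, and the empty set does not, so {1} is irreducible in the paper's sense.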

  • Designing for the Long Tail of Machine Learning
    arXiv.cs.AI Pub Date : 2020-01-21
    Martin Lindvall; Jesper Molin

    Recent technical advances have made machine learning (ML) a promising component to include in end-user-facing systems. However, user experience (UX) practitioners face challenges in relating ML to existing user-centered design processes and in navigating the possibilities and constraints of this design space. Drawing on our own experience, we characterize designing within this space as navigating trade-offs between data gathering, model development, and designing valuable interactions for a given model performance. We suggest that the theoretical description of how machine learning performance scales with training data can guide designers in these trade-offs, as well as having implications for prototyping. We exemplify the learning curve's usage by arguing that a useful pattern is to design an initial system in a bootstrap phase that aims to exploit the training effect of data collected at increasing orders of magnitude.
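    The learning-curve reasoning can be made concrete with a small fit. A power law err(n) ≈ a·n^(−b) is one common empirical model for how error falls with training-set size; fitting it to a few early measurements lets a designer extrapolate what another order of magnitude of data might buy. The functional form and the numbers below are our illustration, not the paper's.

```python
import numpy as np

def fit_power_law(ns, errs):
    # linear fit of log(err) vs log(n):  err(n) ≈ a * n**(-b)
    slope, log_a = np.polyfit(np.log(ns), np.log(errs), 1)
    return np.exp(log_a), -slope

ns   = np.array([100, 1_000, 10_000])   # training-set sizes tried so far
errs = np.array([0.30, 0.19, 0.12])     # hypothetical measured error rates
a, b = fit_power_law(ns, errs)
pred = a * 100_000 ** (-b)              # extrapolated error at n = 100k
print(round(pred, 3))  # → 0.076
```

    Such an extrapolation is exactly the kind of input a bootstrap-phase design decision needs: whether collecting ten times more data is likely to buy a meaningful interaction improvement.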

  • Unsupervisedly Learned Representations: Should the Quest be Over?
    arXiv.cs.AI Pub Date : 2020-01-21
    Daniel N. Nissani (Nissensohn)

    There exists a Classification accuracy gap of about 20% between our best methods of generating Unsupervisedly Learned Representations and the accuracy rates achieved by (naturally Unsupervisedly Learning) humans. We are at least in our fourth decade of searching for this class of paradigms. It thus may well be that we are looking in the wrong direction. We present in this paper a possible solution to this puzzle. We demonstrate that Reinforcement Learning schemes can learn representations, which may be used for Pattern Recognition tasks such as Classification, achieving practically the same accuracy as that of humans. Our main modest contribution lies in the observations that: a. when applied to a real world environment (e.g. nature itself) Reinforcement Learning does not require labels, and thus may be considered a natural candidate for the long-sought, accuracy-competitive Unsupervised Learning method, and b. in contrast, when Reinforcement Learning is applied in a simulated or symbolic processing environment (e.g. a computer program) it does inherently require labels and should thus be generally classified, with some exceptions, as Supervised Learning. The corollary of these observations is that further search for Unsupervised Learning competitive paradigms which may be trained in simulated environments, like many of those found in research and applications, may be futile.

  • Combining Federated and Active Learning for Communication-efficient Distributed Failure Prediction in Aeronautics
    arXiv.cs.AI Pub Date : 2020-01-21
    Nicolas Aussel (INF, ACMES-SAMOVAR, IP Paris); Sophie Chabridon (IP Paris, INF, ACMES-SAMOVAR); Yohan Petetin (TIPIC-SAMOVAR, CITI, IP Paris)

    Machine Learning has proven useful in recent years as a way to achieve failure prediction for industrial systems. However, the high computational resources necessary to run learning algorithms are an obstacle to its widespread application. The sub-field of Distributed Learning offers a solution to this problem by enabling the use of remote resources, but at the expense of introducing communication costs that are not always acceptable. In this paper, we propose a distributed learning approach able to optimize the use of computational and communication resources to achieve excellent learning model performance through a centralized architecture. To achieve this, we present a new centralized distributed learning algorithm that relies on the learning paradigms of Active Learning and Federated Learning to offer a communication-efficient method with guarantees of model precision on both the clients and the central server. We evaluate this method on a public benchmark and show that its precision comes very close to the state-of-the-art performance of non-distributed learning despite the additional constraints.
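    A toy rendering of how the two paradigms can combine (a construction of ours; the paper's algorithm and its precision guarantees are not reproduced): each client labels only the pool points the current global model is least certain about, refits locally, and the server averages the local models, FedAvg-style, so only model weights and a few queried points are ever communicated.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_client(n=60):
    X = rng.normal(size=(n, 2))
    y = np.sign(X @ np.array([1.0, 0.5]))   # shared true concept
    return X, y

def client_update(labeled, pool, w_global, budget=4):
    X_lab, y_lab = labeled
    X_pool, y_pool = pool
    # Active Learning: label only points the global model is least sure of
    pick = np.argsort(np.abs(X_pool @ w_global))[:budget]
    X_lab = np.vstack([X_lab, X_pool[pick]])
    y_lab = np.concatenate([y_lab, y_pool[pick]])
    keep = np.setdiff1d(np.arange(len(X_pool)), pick)
    w, *_ = np.linalg.lstsq(X_lab, y_lab, rcond=None)   # local fit
    return (X_lab, y_lab), (X_pool[keep], y_pool[keep]), w

clients = []
for _ in range(2):
    X, y = make_client()
    clients.append({"labeled": (X[:5], y[:5]), "pool": (X[5:], y[5:])})

w = np.zeros(2)
for _ in range(4):                      # federated rounds
    ws = []
    for c in clients:
        c["labeled"], c["pool"], w_i = client_update(c["labeled"], c["pool"], w)
        ws.append(w_i)
    w = np.mean(ws, axis=0)             # FedAvg-style aggregation

acc = np.mean([np.mean(np.sign(np.vstack([c["labeled"][0], c["pool"][0]]) @ w)
                       == np.concatenate([c["labeled"][1], c["pool"][1]]))
               for c in clients])
print(round(acc, 2))
```

    With a handful of labels per round the averaged linear model typically recovers the shared concept; the paper's method additionally bounds precision on both clients and server, which this sketch does not attempt.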

  • Engineering AI Systems: A Research Agenda
    arXiv.cs.AI Pub Date : 2020-01-16
    Jan Bosch; Ivica Crnkovic; Helena Holmström Olsson

    Deploying machine-learning, and in particular deep-learning, (ML/DL) solutions in industry-strength, production-quality contexts proves to be challenging. It requires a structured engineering approach to constructing and evolving systems that contain ML/DL components. In this paper, we provide a conceptualization of the typical evolution patterns that companies experience when employing ML/DL, as well as a framework for integrating ML/DL components in systems consisting of multiple types of components. In addition, we provide an overview of the engineering challenges surrounding AI/ML/DL solutions and, based on that, a research agenda and overview of open items that need to be addressed by the research community at large.

  • Node Masking: Making Graph Neural Networks Generalize and Scale Better
    arXiv.cs.AI Pub Date : 2020-01-17
    Pushkar Mishra; Aleksandra Piktus; Gerard Goossen; Fabrizio Silvestri

    Graph Neural Networks (GNNs) have received a lot of interest in recent times. From the early spectral architectures that could only operate on undirected graphs under a transductive learning paradigm, to the current state-of-the-art spatial ones that can apply inductively to arbitrary graphs, GNNs have seen significant contributions from the research community. In this paper, we discuss some theoretical tools to better visualize the operations performed by state-of-the-art spatial GNNs. We analyze the inner workings of these architectures and introduce a simple concept, node masking, that allows them to generalize and scale better. To empirically validate the theory, we perform several experiments on two widely used benchmark datasets for node classification in both transductive and inductive settings.

  • Domain-Aware Dialogue State Tracker for Multi-Domain Dialogue Systems
    arXiv.cs.AI Pub Date : 2020-01-21
    Vevake Balaraman; Bernardo Magnini

    In task-oriented dialogue systems, the dialogue state tracker (DST) component is responsible for predicting the state of the dialogue based on the dialogue history. Current DST approaches rely on a predefined domain ontology, a fact that limits their effective usage for large-scale conversational agents, where the DST constantly needs to be interfaced with ever-increasing services and APIs. To overcome this drawback, we propose a domain-aware dialogue state tracker that is completely data-driven and is modeled to predict for dynamic service schemas. The proposed model utilizes domain and slot information to extract both domain- and slot-specific representations for a given dialogue, and then uses such representations to predict the values of the corresponding slots. Integrating this mechanism with a pretrained language model (i.e. BERT), our approach can effectively learn semantic relations.

  • Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping
    arXiv.cs.AI Pub Date : 2020-01-15
    Eugenio Bargiacchi; Timothy Verstraeten; Diederik M. Roijers; Ann Nowé

    We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping, for efficient learning in multi-agent Markov decision processes. The algorithm allows for sample-efficient learning on large problems by exploiting a factorization to approximate the value function. Our approach only requires knowledge about the structure of the problem in the form of a dynamic decision network. Using this information, our method learns a model of the environment and performs temporal difference updates which affect multiple joint states and actions at once. Batch updates are additionally performed which efficiently back-propagate knowledge throughout the factored Q-function. Our method outperforms the state-of-the-art sparse cooperative Q-learning algorithm, both on the well-known SysAdmin benchmark and on randomized environments.
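    For intuition, here is classic single-agent prioritized sweeping on a known deterministic model: value changes are propagated backwards to predecessor states in order of Bellman-error priority. The cooperative, factored multi-agent variant the paper proposes builds on this idea but is not reproduced here; the chain MDP and parameters below are our own toy example.

```python
import heapq
from collections import defaultdict

def prioritized_sweeping(S, A, T, R, gamma=0.9, theta=1e-6, max_updates=10_000):
    V = {s: 0.0 for s in S}
    preds = defaultdict(set)              # preds[s2] = states that reach s2
    for s in S:
        for a in A:
            preds[T[s, a]].add(s)

    def bellman(s):
        return max(R[s, a] + gamma * V[T[s, a]] for a in A)

    pq = [(-abs(bellman(s) - V[s]), s) for s in S]
    heapq.heapify(pq)
    for _ in range(max_updates):
        if not pq:
            break
        prio, s = heapq.heappop(pq)       # state with the largest error
        if -prio < theta:
            break
        V[s] = bellman(s)
        for p in preds[s]:                # push affected predecessors
            err = abs(bellman(p) - V[p])
            if err > theta:
                heapq.heappush(pq, (-err, p))
    return V

# 3-state chain: moving right toward a rewarding absorbing state
S, A = [0, 1, 2], ["left", "right"]
T = {(s, "right"): min(s + 1, 2) for s in S}
T.update({(s, "left"): max(s - 1, 0) for s in S})
R = {(s, a): (1.0 if T[s, a] == 2 else 0.0) for (s, a) in T}
V = prioritized_sweeping(S, A, T, R)
print(round(V[0], 2))  # → 9.0  (V[2] = 1/(1-0.9) = 10, discounted back)
```

    The priority queue is what makes sweeping efficient: updates concentrate where the Bellman error is largest instead of sweeping all states uniformly, which is also the leverage the factored multi-agent version exploits.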

  • Classification accuracy as a proxy for two sample testing
    arXiv.cs.AI Pub Date : 2016-02-06
    Ilmun Kim; Aaditya Ramdas; Aarti Singh; Larry Wasserman

    When data analysts train a classifier and check if its accuracy is significantly different from chance, they are implicitly performing a two-sample test. We investigate the statistical properties of this flexible approach in the high-dimensional setting. We prove two results that hold for all classifiers in any dimensions: if its true error remains $\epsilon$-better than chance for some $\epsilon>0$ as $d,n \to \infty$, then (a) the permutation-based test is consistent (has power approaching one), (b) a computationally efficient test based on a Gaussian approximation of the null distribution is also consistent. To get a finer understanding of the rates of consistency, we study a specialized setting of distinguishing Gaussians with mean-difference $\delta$ and common (known or unknown) covariance $\Sigma$, when $d/n \to c \in (0,\infty)$. We study variants of Fisher's linear discriminant analysis (LDA) such as "naive Bayes" in a nontrivial regime when $\epsilon \to 0$ (the Bayes classifier has true accuracy approaching 1/2), and contrast their power with corresponding variants of Hotelling's test. Surprisingly, the expressions for their power match exactly in terms of $n,d,\delta,\Sigma$, and the LDA approach is only worse by a constant factor, achieving an asymptotic relative efficiency (ARE) of $1/\sqrt{\pi}$ for balanced samples. We also extend our results to high-dimensional elliptical distributions with finite kurtosis. Other results of independent interest include minimax lower bounds, and the optimality of Hotelling's test when $d=o(n)$. Simulation results validate our theory, and we present practical takeaway messages along with natural open problems.
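    The implicit test can be made explicit in a few lines: train a (deliberately simple) classifier on the pooled, labeled data and compare its accuracy against the permutation null. The nearest-centroid classifier and in-sample accuracy below are simplifications of ours; the paper's analysis covers general classifiers and proper sample splitting.

```python
import numpy as np

rng = np.random.default_rng(1)

def accuracy_two_sample_test(X, Y, n_perms=500):
    def acc(Z, labels):
        c0 = Z[labels == 0].mean(axis=0)   # nearest-centroid classifier
        c1 = Z[labels == 1].mean(axis=0)
        pred = (np.linalg.norm(Z - c1, axis=1)
                < np.linalg.norm(Z - c0, axis=1)).astype(int)
        return np.mean(pred == labels)

    Z = np.vstack([X, Y])
    labels = np.r_[np.zeros(len(X), int), np.ones(len(Y), int)]
    observed = acc(Z, labels)
    # permutation null: accuracy when sample membership is shuffled
    null = [acc(Z, rng.permutation(labels)) for _ in range(n_perms)]
    return observed, np.mean([a >= observed for a in null])   # p-value

X = rng.normal(0.0, 1.0, size=(100, 5))
Y = rng.normal(1.0, 1.0, size=(100, 5))   # mean-shifted second sample
acc_obs, p = accuracy_two_sample_test(X, Y)
print(acc_obs > 0.7, p < 0.05)  # → True True: the shift is detected
```

    Under the null (identical distributions), the observed accuracy is exchangeable with the permuted ones, which is what licenses the permutation p-value.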

  • Generalization and Regularization in DQN
    arXiv.cs.AI Pub Date : 2018-09-29
    Jesse Farebrother; Marlos C. Machado; Michael Bowling

    Deep reinforcement learning algorithms have shown an impressive ability to learn complex control policies in high-dimensional tasks. However, despite the ever-increasing performance on popular benchmarks, policies learned by deep reinforcement learning algorithms can struggle to generalize when evaluated in remarkably similar environments. In this paper we propose a protocol to evaluate generalization in reinforcement learning through different modes of Atari 2600 games. With that protocol we assess the generalization capabilities of DQN, one of the most traditional deep reinforcement learning algorithms, and we provide evidence suggesting that DQN overspecializes to the training environment. We then comprehensively evaluate the impact of dropout and $\ell_2$ regularization, as well as the impact of reusing learned representations to improve the generalization capabilities of DQN. Despite regularization being largely underutilized in deep reinforcement learning, we show that it can, in fact, help DQN learn more general features. These features can be reused and fine-tuned on similar tasks, considerably improving DQN's sample efficiency.

  • Random Spiking and Systematic Evaluation of Defenses Against Adversarial Examples
    arXiv.cs.AI Pub Date : 2018-12-05
    Huangyi Ge; Sze Yiu Chau; Bruno Ribeiro; Ninghui Li

    Image classifiers often suffer from adversarial examples, which are generated by strategically adding a small amount of noise to input images to trick classifiers into misclassification. Over the years, many defense mechanisms have been proposed, and different researchers have made seemingly contradictory claims on their effectiveness. We present an analysis of possible adversarial models, and propose an evaluation framework for comparing different defense mechanisms. As part of the framework, we introduce a more powerful and realistic adversary strategy. Furthermore, we propose a new defense mechanism called Random Spiking (RS), which generalizes dropout and introduces random noises in the training process in a controlled manner. Evaluations under our proposed framework suggest RS delivers better protection against adversarial examples than many existing schemes.

  • Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning
    arXiv.cs.AI Pub Date : 2019-01-26
    Ying Wen; Yaodong Yang; Rui Luo; Jun Wang

    Most multi-agent reinforcement learning (MARL) models assume perfectly rational agents -- a property hardly met in real-world decision making due to individuals' cognitive limitations and/or the intractability of the decision problem. In this paper, we introduce generalized recursive reasoning (GR2) as a novel framework to model agents with different \emph{hierarchical} levels of rationality; our framework enables agents to exhibit varying levels of "thinking" ability, thereby allowing higher-level agents to best respond to various less sophisticated learners. We contribute both theoretically and empirically. On the theory side, we devise the hierarchical framework of GR2 through probabilistic graphical models and prove the existence of a perfect Bayesian equilibrium. Within GR2, we propose a practical actor-critic solver, and demonstrate its convergence to a stationary point in two-player games through Lyapunov analysis. On the empirical side, we validate our findings on a variety of MARL benchmarks. Precisely, we first illustrate the hierarchical thinking process on the Keynes Beauty Contest, and then demonstrate significant improvements over state-of-the-art opponent modeling baselines on normal-form games and the cooperative navigation benchmark.

  • An Argumentation-Based Reasoner to Assist Digital Investigation and Attribution of Cyber-Attacks
    arXiv.cs.AI Pub Date : 2019-04-30
    Erisa Karafili; Linna Wang; Emil C. Lupu

    We expect an increase in the frequency and severity of cyber-attacks that comes along with the need for efficient security countermeasures. The process of attributing a cyber-attack helps to construct efficient and targeted mitigating and preventive security measures. In this work, we propose an argumentation-based reasoner (ABR) as a proof-of-concept tool that can help a forensics analyst during the analysis of forensic evidence and the attribution process. Given the evidence collected from a cyber-attack, our reasoner can assist the analyst during the investigation process, by helping him/her to analyze the evidence and identify who performed the attack. Furthermore, it suggests to the analyst where to focus further analyses by giving hints of the missing evidence or new investigation paths to follow. ABR is the first automatic reasoner that can combine both technical and social evidence in the analysis of a cyber-attack, and that can also cope with incomplete and conflicting information. To illustrate how ABR can assist in the analysis and attribution of cyber-attacks we have used examples of cyber-attacks and their analyses as reported in publicly available reports and online literature. We do not mean to either agree or disagree with the analyses presented therein or reach attribution conclusions.

  • Deep Residual Reinforcement Learning
    arXiv.cs.AI Pub Date : 2019-05-03
    Shangtong Zhang; Wendelin Boehmer; Shimon Whiteson

    We revisit residual algorithms in both model-free and model-based reinforcement learning settings. We propose the bidirectional target network technique to stabilize residual algorithms, yielding a residual version of DDPG that significantly outperforms vanilla DDPG in the DeepMind Control Suite benchmark. Moreover, we find the residual algorithm an effective approach to the distribution mismatch problem in model-based planning. Compared with the existing TD($k$) method, our residual-based method makes weaker assumptions about the model and yields a greater performance boost.
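    The residual-vs-semi-gradient distinction the paper revisits is easiest to see in the linear case, V(s) = w·φ(s): semi-gradient TD ignores the gradient through the bootstrapped target, while the residual-gradient update (Baird-style) differentiates the full Bellman residual. This toy contrast is ours; the paper's bidirectional target network for DDPG is not shown.

```python
import numpy as np

def td_updates(w, phi_s, phi_s2, r, gamma=0.9, alpha=0.1):
    delta = r + gamma * phi_s2 @ w - phi_s @ w               # TD error
    semi_grad = w + alpha * delta * phi_s                     # target held fixed
    residual = w + alpha * delta * (phi_s - gamma * phi_s2)   # full gradient
    return semi_grad, residual

w0 = np.zeros(2)
sg, res = td_updates(w0, np.array([1.0, 0.0]), np.array([0.0, 1.0]), r=1.0)
print(sg, res)  # the residual update also moves the successor's weight
```

    Here delta = 1, so the semi-gradient update only changes the first weight (to 0.1), while the residual update also pushes the successor's weight in the opposite direction (to -0.09), which is what buys residual methods their stronger convergence behaviour at the cost of a biased-looking update.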

  • Evidence Propagation and Consensus Formation in Noisy Environments
    arXiv.cs.AI Pub Date : 2019-05-13
    Michael Crosscombe; Jonathan Lawry; Palina Bartashevich

    We study the effectiveness of consensus formation in multi-agent systems where there is both belief updating based on direct evidence and also belief combination between agents. In particular, we consider the scenario in which a population of agents collaborate on the best-of-n problem where the aim is to reach a consensus about which is the best (alternatively, true) state from amongst a set of states, each with a different quality value (or level of evidence). Agents' beliefs are represented within Dempster-Shafer theory by mass functions and we investigate the macro-level properties of four well-known belief combination operators for this multi-agent consensus formation problem: Dempster's rule, Yager's rule, Dubois & Prade's operator and the averaging operator. The convergence properties of the operators are considered and simulation experiments are conducted for different evidence rates and noise levels. Results show that a combination of updating on direct evidence and belief combination between agents results in better consensus to the best state than does evidence updating alone. We also find that in this framework the operators are robust to noise. Broadly, Yager's rule is shown to be the better operator under various parameter values, i.e. convergence to the best state, robustness to noise, and scalability.
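    Two of the four operators studied are easy to sketch for mass functions over frozenset focal elements: Dempster's rule renormalizes away the conflicting mass, while Yager's rule assigns it to the whole frame (total ignorance). The example numbers below are ours.

```python
from itertools import product

def combine(m1, m2, rule="dempster"):
    frame = frozenset().union(*m1, *m2)
    out, conflict = {}, 0.0
    for A, B in product(m1, m2):
        inter, mass = A & B, m1[A] * m2[B]
        if inter:
            out[inter] = out.get(inter, 0.0) + mass
        else:
            conflict += mass              # mass on incompatible focal elements
    if rule == "dempster":
        out = {A: v / (1.0 - conflict) for A, v in out.items()}
    else:  # Yager: conflict goes to total ignorance (the whole frame)
        out[frame] = out.get(frame, 0.0) + conflict
    return out

m1 = {frozenset("a"): 0.8, frozenset("ab"): 0.2}
m2 = {frozenset("b"): 0.6, frozenset("ab"): 0.4}
dempster = combine(m1, m2)
yager = combine(m1, m2, rule="yager")
print(dempster[frozenset("a")])  # 0.32 / 0.52 ≈ 0.615
print(yager[frozenset("ab")])    # 0.08 + 0.48 ≈ 0.56
```

    The two operators disagree exactly in how they treat the 0.48 of conflicting mass, which is one reason their macro-level consensus behaviour under noise can differ as the abstract reports.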

  • Toward a Dempster-Shafer theory of concepts
    arXiv.cs.AI Pub Date : 2019-08-14
    Sabine Frittella; Krishna Manoorkar; Alessandra Palmigiano; Apostolos Tzimoulis; Nachoem M. Wijnberg

    In this paper, we generalize the basic notions and results of Dempster-Shafer theory from predicates to formal concepts. Results include the representation of conceptual belief functions as inner measures of suitable probability functions, and a Dempster-Shafer rule of combination on belief functions on formal concepts.

  • Bayesian Local Sampling-based Planning
    arXiv.cs.AI Pub Date : 2019-09-08
    Tin Lai; Philippe Morere; Fabio Ramos; Gilad Francis

    Sampling-based planning is the predominant paradigm for motion planning in robotics. Most sampling-based planners use a global random sampling scheme to guarantee probabilistic completeness. However, such schemes are often inefficient, as the samples are drawn from a global proposal distribution and do not exploit relevant local structures. Local sampling-based motion planners, on the other hand, take sequential random-walk decisions to sample valid trajectories in configuration space. However, current approaches do not adapt their strategies according to the successes and failures of past samples. In this work, we introduce a local sampling-based motion planner with a Bayesian learning scheme for modelling an adaptive sampling proposal distribution. The proposal distribution is sequentially updated based on previous samples, consequently shaping it according to local obstacles and constraints in the configuration space. Thus, by learning from past observed outcomes, we maximise the likelihood of sampling in regions that have a higher probability of forming trajectories within narrow passages. We provide the formulation of a sample-efficient distribution, along with the theoretical foundation for sequentially updating this distribution. We demonstrate experimentally that by using a Bayesian proposal distribution, a solution is found faster, requiring fewer samples, and without any noticeable performance overhead.
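    The core idea, stripped down to a grid-world toy (the paper works in continuous configuration space; this construction and all its parameters are ours): keep a Beta posterior on the success probability of each local move direction, choose directions by Thompson sampling after a short forced-exploration warm-up, and update the posterior on each collision or valid sample.

```python
import random

random.seed(0)

DIRS = {"+x": (1, 0), "-x": (-1, 0), "+y": (0, 1), "-y": (0, -1)}

def learn_proposal(start, is_free, n_iters=300, warmup=5):
    alpha = {k: 1.0 for k in DIRS}   # Beta(1, 1) prior per direction
    beta = {k: 1.0 for k in DIRS}
    pos = start
    order = [k for k in DIRS for _ in range(warmup)]   # forced exploration
    for t in range(n_iters):
        if t < len(order):
            k = order[t]
        else:  # Thompson sampling: prefer directions believed to succeed
            k = max(DIRS, key=lambda k: random.betavariate(alpha[k], beta[k]))
        dx, dy = DIRS[k]
        nxt = (pos[0] + dx, pos[1] + dy)
        if is_free(nxt):
            alpha[k] += 1.0          # valid local sample
            pos = nxt
        else:
            beta[k] += 1.0           # collision feedback reshapes the proposal
    return {k: alpha[k] / (alpha[k] + beta[k]) for k in DIRS}

# narrow corridor along the x-axis: only y == 0 with x >= 0 is free
est = learn_proposal((0, 0), lambda p: p[1] == 0 and p[0] >= 0)
print(est["+x"] > 0.8, est["+y"] < 0.3)  # → True True
```

    After a few collisions the posterior mean for the blocked directions collapses, so the proposal concentrates along the corridor, which is the narrow-passage behaviour the abstract describes.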

  • Machine learning in healthcare -- a system's perspective
    arXiv.cs.AI Pub Date : 2019-09-14
    Awais Ashfaq; Slawomir Nowaczyk

    A consequence of the fragmented and siloed healthcare landscape is that patient care (and data) is split across a multitude of different facilities and computer systems, and enabling interoperability between these systems is hard. The lack of interoperability not only hinders continuity of care and burdens providers, but also hinders effective application of Machine Learning (ML) algorithms. Thus, most current ML algorithms, designed to understand patient care and facilitate clinical decision-support, are trained on limited datasets. This approach is analogous to the Newtonian paradigm of Reductionism, in which a system is broken down into elementary components and a description of the whole is formed by understanding those components individually. A key limitation of the reductionist approach is that it ignores the component-component interactions and dynamics within the system, which are often of prime significance in understanding the overall behaviour of complex adaptive systems (CAS). Healthcare is a CAS. Though the application of ML to health data has shown incremental improvements for clinical decision support, ML has a much broader potential to restructure care delivery as a whole and maximize care value. However, this ML potential remains largely untapped, primarily due to functional limitations of Electronic Health Records (EHR) and the inability to see the healthcare system as a whole. This viewpoint (i) articulates healthcare as a complex system which has a biological and an organizational perspective, (ii) motivates with examples the need for a system's approach when addressing healthcare challenges via ML, and (iii) emphasizes the need to unleash EHR functionality - while duly respecting all ethical and legal concerns - to reap the full benefits of ML.

  • Efficient Local Causal Discovery Based on Markov Blanket
    arXiv.cs.AI Pub Date : 2019-10-03
    Shuai Yang; Hao Wang; Xuegang Hu

    We study the problem of local causal discovery, which identifies the direct causes and effects of a target variable of interest in a causal network. Existing constraint-based local causal discovery approaches are inefficient, since they do not take into account the triangular structures formed by a given variable and its child variables when learning the local causal structure, and hence need to spend much time distinguishing several direct effects. Additionally, these approaches depend on standard MB (Markov Blanket) or PC (Parent and Children) discovery algorithms, which require conducting many conditional independence tests to obtain the MB or PC sets. To overcome these problems, in this paper we propose a novel Efficient Local Causal Discovery algorithm via MB (ELCD) to identify the direct causes and effects of a given variable. More specifically, we design a new algorithm for Efficient Oriented MB discovery, named EOMB. EOMB not only uses fewer conditional independence tests to identify the MB, but is also able to identify more direct effects of a given variable with the help of triangular causal structures, and to determine as many direct causes as possible. In addition, based on the proposed EOMB, ELCD is presented to learn the local causal structure around a target variable. The benefits of ELCD are that it not only determines the direct causes and effects of a given variable accurately, but also runs faster than other local causal discovery algorithms. Experimental results on eight Bayesian networks (BNs) show that our proposed approach performs better than state-of-the-art baseline methods.
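    The conditional-independence (CI) tests that dominate the cost of MB/PC discovery are, for Gaussian data, typically partial-correlation (Fisher-z) tests; here is a minimal sketch of one such test on a chain x → z → y. Reducing how many of these tests are needed is the paper's contribution, which this sketch does not reproduce; the data and thresholds are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def partial_corr(x, y, Z):
    # correlation of x and y after regressing out the columns of Z
    def residual(v):
        if Z.shape[1] == 0:
            return v - v.mean()
        D = np.c_[np.ones(len(v)), Z]
        coef, *_ = np.linalg.lstsq(D, v, rcond=None)
        return v - D @ coef
    rx, ry = residual(x), residual(y)
    return rx @ ry / np.sqrt((rx @ rx) * (ry @ ry))

def fisher_z_stat(r, n, k):
    # test statistic for H0: partial correlation is zero (k = |Z|)
    return abs(0.5 * np.log((1 + r) / (1 - r))) * np.sqrt(n - k - 3)

n = 2000
x = rng.normal(size=n)
z = x + 0.5 * rng.normal(size=n)    # chain: x -> z -> y
y = z + 0.5 * rng.normal(size=n)

marginal = fisher_z_stat(partial_corr(x, y, np.empty((n, 0))), n, 0)
given_z  = fisher_z_stat(partial_corr(x, y, z[:, None]), n, 1)
print(f"x,y marginally: z={marginal:.1f}; given z: z={given_z:.2f}")
```

    At the usual 1.96 threshold, the first statistic rejects independence while the second (whose population partial correlation on this chain is exactly zero) typically does not; a discovery algorithm issues many such tests with varying conditioning sets, which is why reducing their number matters.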

  • DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames
    arXiv.cs.AI Pub Date : 2019-11-01
    Erik Wijmans; Abhishek Kadian; Ari Morcos; Stefan Lee; Irfan Essa; Devi Parikh; Manolis Savva; Dhruv Batra

    We present Decentralized Distributed Proximal Policy Optimization (DD-PPO), a method for distributed reinforcement learning in resource-intensive simulated environments. DD-PPO is distributed (uses multiple machines), decentralized (lacks a centralized server), and synchronous (no computation is ever stale), making it conceptually simple and easy to implement. In our experiments on training virtual robots to navigate in Habitat-Sim, DD-PPO exhibits near-linear scaling -- achieving a speedup of 107x on 128 GPUs over a serial implementation. We leverage this scaling to train an agent for 2.5 Billion steps of experience (the equivalent of 80 years of human experience) -- over 6 months of GPU-time training in under 3 days of wall-clock time with 64 GPUs. This massive-scale training not only sets the state of the art on the Habitat Autonomous Navigation Challenge 2019, but essentially solves the task -- near-perfect autonomous navigation in an unseen environment without access to a map, directly from an RGB-D camera and a GPS+Compass sensor. Fortuitously, error vs computation exhibits a power-law-like distribution; thus, 90% of peak performance is obtained relatively early (at 100 million steps) and relatively cheaply (under 1 day with 8 GPUs). Finally, we show that the scene understanding and navigation policies learned can be transferred to other navigation tasks -- the analog of ImageNet pre-training + task-specific fine-tuning for embodied AI. Our model outperforms ImageNet pre-trained CNNs on these transfer tasks and can serve as a universal resource (all models and code are publicly available).

  • Optimal Farsighted Agents Tend to Seek Power
    arXiv.cs.AI Pub Date : 2019-12-03
    Alexander Matt Turner

    Some researchers have speculated that capable reinforcement learning (RL) agents pursuing misspecified objectives are often incentivized to seek resources and power in pursuit of those objectives. An agent seeking power is incentivized to behave in undesirable ways, including rationally preventing deactivation and correction. Others have voiced skepticism: humans seem idiosyncratic in their urges to power, which need not be present in the agents we design. We formalize a notion of power within the context of finite deterministic Markov decision processes (MDPs). We prove that, with respect to a neutral class of reward function distributions, optimal policies tend to seek power over the environment.

  • A Survey on the Use of Preferences for Virtual Machine Placement in Cloud Data Centers
    arXiv.cs.AI Pub Date : 2019-07-17
    Abdulaziz Alashaikh; Eisa Alanazi; Ala Al-Fuqaha

    With the rapid development of virtualization techniques, cloud data centers allow for cost-effective, flexible, and customizable deployments of applications on virtualized infrastructure. Virtual machine (VM) placement aims to assign each virtual machine to a server in the cloud environment. VM placement is of paramount importance to the design of cloud data centers. Typically, VM placement involves complex relations and multiple design factors, as well as local policies that govern the assignment decisions. It also involves different constituents, including cloud administrators and customers, that might have disparate preferences while opting for a placement solution. Thus, it is often valuable to return not only an optimized solution to the VM placement problem but also a solution that reflects the given preferences of the constituents. In this paper, we provide a detailed review of the role of preferences in the recent literature on VM placement. We further discuss key challenges and identify possible research opportunities to better incorporate preferences within the context of VM placement.

    Updated: 2020-01-22
  • Modeling and solving the multimodal car- and ride-sharing problem
    arXiv.cs.AI Pub Date : 2020-01-15
    Miriam Enzi; Sophie N. Parragh; David Pisinger; Matthias Prandtstetter

    We introduce the multimodal car- and ride-sharing problem (MMCRP), in which a pool of cars is used to cover a set of ride requests, while uncovered requests are assigned to other modes of transport (MOT). A car's route consists of one or more trips. Each trip must have a specific but non-predetermined driver, start in a depot, and finish in a (possibly different) depot. Ride-sharing between users is allowed, even when two rides do not share the same origin and/or destination. A user always has the option of using other modes of transport according to an individual list of preferences. The problem can be formulated as a vehicle scheduling problem. To solve the problem, an auxiliary graph is constructed in which each trip starting and ending in a depot, and covering possible ride-shares, is modeled as an edge in a time-space graph. We propose a two-layer decomposition algorithm based on column generation, where the master problem ensures that each request is covered at most once, and the pricing problem generates new promising routes by solving a variant of the shortest path problem in a time-space network. Computational experiments based on realistic instances are reported. The benchmark instances are based on demographic, spatial, and economic data of Vienna, Austria. We solve large instances with the column-generation-based approach to near optimality in reasonable time, and we further investigate various exact and heuristic pricing schemes.
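    The pricing step exploits the structure of the time-space network: since every arc moves forward in time, the graph is acyclic and a single forward pass suffices, even with negative reduced costs. A minimal sketch, assuming nodes are supplied in time order and reduced arc costs have already been derived from the master problem's duals (the node names and costs below are made up for illustration):

```python
import math

def cheapest_route(nodes, arcs, source, sink):
    """Shortest path in a time-space network, as used in a column-generation
    pricing step. `nodes` must be listed in time (topological) order; arc
    costs are reduced costs and may be negative."""
    dist = {v: math.inf for v in nodes}
    pred = {v: None for v in nodes}
    dist[source] = 0.0
    out = {}
    for u, v, c in arcs:
        out.setdefault(u, []).append((v, c))
    for u in nodes:                      # one pass in topological order
        if dist[u] == math.inf:
            continue
        for v, c in out.get(u, []):
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
                pred[v] = u
    path, v = [], sink
    while v is not None:                 # reconstruct the depot-to-depot route
        path.append(v)
        v = pred[v]
    return dist[sink], path[::-1]
```

    A route whose reduced cost comes out negative is a promising new column for the master problem; when no such route exists, column generation terminates.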

    Updated: 2020-01-17
  • Stereotypical Bias Removal for Hate Speech Detection Task using Knowledge-based Generalizations
    arXiv.cs.AI Pub Date : 2020-01-15
    Pinkesh Badjatiya; Manish Gupta; Vasudeva Varma

    With the ever-increasing cases of hate spread on social media platforms, it is critical to design abuse detection mechanisms to proactively avoid and control such incidents. While methods for hate speech detection exist, they stereotype words and hence suffer from inherently biased training. Bias removal has traditionally been studied for structured datasets, but we aim at bias mitigation from unstructured text data. In this paper, we make two important contributions. First, we systematically design methods to quantify the bias for any model and propose algorithms for identifying the set of words which the model stereotypes. Second, we propose novel methods leveraging knowledge-based generalizations for bias-free learning. Knowledge-based generalization provides an effective way to encode knowledge because the abstractions it provides not only generalize content but also facilitate retraction of information from the hate speech detection classifier, thereby reducing the imbalance. We experiment with multiple knowledge generalization policies and analyze their effect on general performance and on mitigating bias. Our experiments with two real-world datasets, a Wikipedia Talk Pages dataset (WikiDetox) of size ~96k and a Twitter dataset of size ~24k, show that the use of knowledge-based generalizations results in better performance by forcing the classifier to learn from generalized content. Our methods utilize existing knowledge bases and can easily be extended to other tasks.
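    One simple way to quantify word-level stereotyping, in the spirit of the first contribution, is to measure how much a classifier's predicted hate probability jumps when a target word is inserted into otherwise neutral sentences. The sketch below is a hypothetical illustration of that idea, not the paper's exact metric; `model_prob` and the template scheme are assumptions.

```python
def bias_score(model_prob, templates, word):
    """Quantify how strongly a hate-speech classifier stereotypes `word`:
    the average increase in predicted hate probability when the word is
    dropped into otherwise neutral template sentences.
    `model_prob` maps text -> P(hate)."""
    deltas = [model_prob(t.format(word)) - model_prob(t.format(""))
              for t in templates]
    return sum(deltas) / len(deltas)
```

    A score near zero means the word alone barely moves the prediction; a large positive score flags a word the model has learned as a shortcut for hate, a candidate for knowledge-based generalization.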

    Updated: 2020-01-17
  • Consumer-Driven Explanations for Machine Learning Decisions: An Empirical Study of Robustness
    arXiv.cs.AI Pub Date : 2020-01-13
    Michael Hind; Dennis Wei; Yunfeng Zhang

    Many proposed methods for explaining machine learning predictions are in fact challenging to understand for nontechnical consumers. This paper builds upon an alternative consumer-driven approach called TED that asks for explanations to be provided in training data, along with target labels. Using semi-synthetic data from credit approval and employee retention applications, experiments are conducted to investigate some practical considerations with TED, including its performance with different classification algorithms, varying numbers of explanations, and variability in explanations. A new algorithm is proposed to handle the case where some training examples do not have explanations. Our results show that TED is robust to increasing numbers of explanations, noisy explanations, and large fractions of missing explanations, thus making advances toward its practical deployment.
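    The core TED mechanic of learning labels and explanations jointly can be sketched by encoding each (label, explanation) pair as one combined class and fitting any multiclass learner on those classes. In the sketch below a nearest-centroid classifier stands in for the learner, purely for illustration (TED is learner-agnostic), and the credit-style explanation strings are hypothetical.

```python
import numpy as np

def ted_fit(X, y, e):
    """TED-style training: each (label, explanation) pair becomes one
    combined class; here we fit a trivial nearest-centroid model on them."""
    combos = sorted(set(zip(y, e)))
    centroids = np.stack([
        X[np.array([(yi, ei) == c for yi, ei in zip(y, e)])].mean(axis=0)
        for c in combos
    ])
    return combos, centroids

def ted_predict(model, x):
    """Predict the combined class, then decode it back into
    (label, explanation) so the consumer receives both."""
    combos, centroids = model
    k = int(np.argmin(((centroids - np.asarray(x)) ** 2).sum(axis=1)))
    return combos[k]
```

    Because the explanation is just part of the target, the approach needs no model introspection at prediction time, which is what makes it accessible to nontechnical consumers.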

    Updated: 2020-01-17
  • Graph Attentional Autoencoder for Anticancer Hyperfood Prediction
    arXiv.cs.AI Pub Date : 2020-01-16
    Guadalupe Gonzalez; Shunwang Gong; Ivan Laponogov; Kirill Veselkov; Michael Bronstein

    Recent research efforts have shown the possibility of discovering anticancer drug-like molecules in food from their effect on protein-protein interaction networks, opening a potential pathway to disease-beating diet design. We formulate this task as a graph classification problem, on which graph neural networks (GNNs) have achieved state-of-the-art results. However, our empirical evidence shows that GNNs are difficult to train on sparse, low-dimensional features. Here, we present graph augmented features, integrating graph structural information and raw node attributes with varying ratios, to ease the training of networks. We further introduce a novel neural network architecture on graphs, the Graph Attentional Autoencoder (GAA), to predict food compounds with anticancer properties based on perturbed protein networks. We demonstrate that the method outperforms the baseline approach and state-of-the-art graph classification models on this task.
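    One minimal way to realize "graph augmented features" is to mix each node's raw attributes with a multi-hop neighbourhood average, controlled by a ratio. The sketch below is an assumption about the general idea; the exact propagation scheme and mixing ratio in the paper may differ.

```python
import numpy as np

def graph_augmented_features(adj, x, alpha=0.5, hops=2):
    """Blend raw node attributes x with structural information:
        h = alpha * x + (1 - alpha) * (D^-1 (A + I))^hops @ x,
    i.e. the original features mixed with row-normalized multi-hop
    neighbourhood averages."""
    a = adj + np.eye(adj.shape[0])        # self-loops keep each node's own signal
    a_norm = a / a.sum(axis=1, keepdims=True)
    prop = x.astype(float)
    for _ in range(hops):                 # multi-hop structural smoothing
        prop = a_norm @ prop
    return alpha * x + (1 - alpha) * prop
```

    When raw features are sparse and low-dimensional, the propagated component supplies a denser, structure-aware signal, which is the stated motivation for easing GNN training.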

    Updated: 2020-01-17
  • Broadening Label-based Argumentation Semantics with May-Must Scales
    arXiv.cs.AI Pub Date : 2020-01-16
    Ryuta Arisaka; Takayuki Ito

    The semantics as to which set of arguments in a given argumentation graph may be acceptable (acceptability semantics) can be characterised in a few different ways. Among them, the labelling-based approach allows for concise and flexible determination of the acceptability status of arguments through the assignment of a label, indicating acceptance, rejection, or undecided, to each argument. In this work, we contemplate a way of broadening it by accommodating may- and must-conditions for an argument to be accepted or rejected, as determined by the number(s) of rejected and accepted attacking arguments. We show that the broadened label-based semantics can be used to express milder indeterminacy than inconsistency for acceptability judgement when, for example, it may be the case that an argument is accepted and it may also be the case that it is rejected. We identify that finding which conditions a labelling satisfies for every argument can be an undecidable problem, which has an unfavourable implication for the semantics. We propose to address this problem by enforcing a labelling to maximally respect the conditions, while keeping the rest, which would necessarily cause non-termination, labelled undecided.

    Updated: 2020-01-17
  • End-to-End Pixel-Based Deep Active Inference for Body Perception and Action
    arXiv.cs.AI Pub Date : 2019-12-28
    Cansu Sancaktar; Pablo Lanillos

    We present a pixel-based deep active inference algorithm (PixelAI), inspired by human body perception and successfully validated in robot body perception and action as a use case. Our algorithm combines the free energy principle from neuroscience, rooted in variational inference, with deep convolutional decoders to scale the algorithm to deal directly with image inputs and provide online adaptive inference. The approach enables the robot to perform 1) dynamical body estimation of its arm using only raw monocular camera images and 2) autonomous reaching to "imagined" arm poses in visual space. We statistically analyzed the algorithm's performance on a simulated and a real Nao robot. Results show how the same algorithm deals with both perception and action, modelled as an inference optimization problem.
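    The perception half of this scheme can be sketched in a few lines: the agent revises an internal state until its generative prediction matches the observation, i.e. gradient descent on the squared sensory prediction error (the free-energy term under Gaussian assumptions). A linear decoder `W` stands in below for PixelAI's deep convolutional decoder; this is a minimal illustration, not the authors' implementation.

```python
import numpy as np

def perceptual_inference(W, o, steps=200, lr=0.1):
    """Iteratively update an internal (body) state z so that the generative
    prediction W @ z matches the observation o, by descending the squared
    sensory prediction error."""
    z = np.zeros(W.shape[1])
    for _ in range(steps):
        err = o - W @ z              # sensory prediction error
        z = z + lr * (W.T @ err)     # move beliefs to reduce the error
    return z
```

    Action in active inference follows the same principle in reverse: instead of changing beliefs to match observations, the agent moves its arm so observations match an "imagined" prediction.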

    Updated: 2020-01-17
  • "Why is 'Chicago' deceptive?" Towards Building Model-Driven Tutorials for Humans
    arXiv.cs.AI Pub Date : 2020-01-14
    Vivian Lai; Han Liu; Chenhao Tan

    To support human decision making with machine learning models, we often need to elucidate patterns embedded in the models that are unsalient, unknown, or counterintuitive to humans. While existing approaches focus on explaining machine predictions with real-time assistance, we explore model-driven tutorials to help humans understand these patterns in a training phase. We consider both tutorials with guidelines from scientific papers, analogous to current practices of science communication, and automatically selected examples from training data with explanations. We use deceptive review detection as a testbed and conduct large-scale, randomized human-subject experiments to examine the effectiveness of such tutorials. We find that tutorials indeed improve human performance, with and without real-time assistance. In particular, although deep learning provides better predictive performance than simple models, tutorials and explanations from simple models are more useful to humans. Our work suggests future directions for human-centered tutorials and explanations towards a synergy between humans and AI.

    Updated: 2020-01-17
  • On Expert Behaviors and Question Types for Efficient Query-Based Ontology Fault Localization
    arXiv.cs.AI Pub Date : 2020-01-16
    Patrick Rodler

    We challenge existing query-based ontology fault localization methods with respect to the assumptions they make, the criteria they optimize, and the interaction means they use. We find that their efficiency depends largely on the behavior of the interacting expert, that the performed calculations can be inefficient or imprecise, and that the optimization criteria used are often not fully realistic. As a remedy, we suggest a novel (and simpler) interaction approach which overcomes all identified problems and, in comprehensive experiments on faulty real-world ontologies, enables successful fault localization while requiring fewer expert interactions in 66% of the cases, and always at least 80% less expert waiting time, compared to existing methods.

    Updated: 2020-01-17
  • #MeToo on Campus: Studying College Sexual Assault at Scale Using Data Reported on Social Media
    arXiv.cs.AI Pub Date : 2020-01-16
    Viet Duong; Phu Pham; Ritwik Bose; Jiebo Luo

    Recently, the emergence of the #MeToo trend on social media has empowered thousands of people to share their own sexual harassment experiences. This viral trend, in conjunction with the massive personal information and content available on Twitter, presents a promising opportunity to extract data-driven insights to complement the ongoing survey-based studies about sexual harassment in college. In this paper, we analyze the influence of the #MeToo trend on a pool of college followers. The results show that the majority of topics embedded in those #MeToo tweets detail sexual harassment stories, and that there exists a significant correlation between the prevalence of this trend and official reports in several major geographical regions. Furthermore, we discover the salient sentiments of the #MeToo tweets using deep semantic meaning representations and their implications for the affected users experiencing different types of sexual harassment. We hope this study can raise further awareness regarding sexual misconduct in academia.

    Updated: 2020-01-17
  • Adversarially Guided Self-Play for Adopting Social Conventions
    arXiv.cs.AI Pub Date : 2020-01-16
    Mycal Tucker; Yilun Zhou; Julie Shah

    Robotic agents must adopt existing social conventions in order to be effective teammates. These social conventions, such as driving on the right or left side of the road, are arbitrary choices among optimal policies, but all agents on a successful team must use the same convention. Prior work has identified a method of combining self-play with paired input-output data gathered from existing agents in order to learn their social convention without interacting with them. We build upon this work by introducing a technique called Adversarial Self-Play (ASP) that uses adversarial training to shape the space of possible learned policies and substantially improves learning efficiency. ASP only requires the addition of unpaired data: a dataset of outputs produced by the social convention without associated inputs. Theoretical analysis reveals how ASP shapes the policy space and the circumstances (when behaviors are clustered or exhibit some other structure) under which it offers the greatest benefits. Empirical results across three domains confirm ASP's advantages: it produces models that more closely match the desired social convention when given as few as two paired datapoints.

    Updated: 2020-01-17
  • Spectral Inference Networks: Unifying Deep and Spectral Learning
    arXiv.cs.AI Pub Date : 2018-06-06
    David Pfau; Stig Petersen; Ashish Agarwal; David G. T. Barrett; Kimberly L. Stachenfeld

    We present Spectral Inference Networks, a framework for learning eigenfunctions of linear operators by stochastic optimization. Spectral Inference Networks generalize Slow Feature Analysis to generic symmetric operators, and are closely related to Variational Monte Carlo methods from computational physics. As such, they can be a powerful tool for unsupervised representation learning from video or graph-structured data. We cast training Spectral Inference Networks as a bilevel optimization problem, which allows for online learning of multiple eigenfunctions. We show results of training Spectral Inference Networks on problems in quantum mechanics and feature learning for videos on synthetic datasets. Our results demonstrate that Spectral Inference Networks accurately recover eigenfunctions of linear operators and can discover interpretable representations from video in a fully unsupervised manner.
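    The underlying optimization can be illustrated in finite dimensions: gradient ascent on the Rayleigh quotient of a symmetric matrix, with renormalization, recovers its leading eigenvector. This is a toy stand-in for Spectral Inference Networks, which parameterize eigenfunctions with neural networks, use stochastic gradient estimates (deterministic here for clarity), and learn several eigenfunctions jointly via the bilevel formulation.

```python
import numpy as np

def top_eigenfunction(A, steps=2000, lr=0.05, seed=0):
    """Recover the leading eigenvector of a symmetric operator A by
    gradient ascent on the Rayleigh quotient v'Av / v'v, renormalizing
    after each step to stay on the unit sphere."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(steps):
        g = A @ v - (v @ A @ v) * v   # Rayleigh-quotient gradient on the sphere
        v += lr * g
        v /= np.linalg.norm(v)
    return v
```

    Replacing the explicit vector `v` with a network evaluated at sampled inputs, and the matrix-vector products with minibatch estimates, is the step from this toy to the stochastic, function-space setting of the paper.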

    Updated: 2020-01-17
  • Reinforcement-learning-based architecture for automated quantum adiabatic algorithm design
    arXiv.cs.AI Pub Date : 2018-12-27
    Jian Lin; Zhong Yuan Lai; Xiaopeng Li

    Quantum algorithm design lies at the hallmark of applications of quantum computation and quantum simulation. Here we put forward a deep reinforcement learning (RL) architecture for automated algorithm design within the framework of the quantum adiabatic algorithm, where the optimal Hamiltonian path to reach a quantum ground state that encodes a computation problem is obtained by RL techniques. We benchmark our approach on Grover search and 3-SAT problems, and find that the adiabatic algorithm obtained by our RL approach leads to significant improvement in the success probability and computing speedups for both moderate and large numbers of qubits compared to conventional algorithms. The RL-designed algorithm is found to be qualitatively distinct from the linear algorithm in the resultant distribution of success probability. Considering the established complexity-equivalence of circuit and adiabatic quantum algorithms, we expect the RL-designed adiabatic algorithm to inspire novel circuit algorithms as well. Our approach offers a recipe to design quantum algorithms for generic problems through an automated RL process, paving a novel way toward automated quantum algorithm design using artificial intelligence, potentially applicable to different quantum simulation and computation platforms, from trapped ions and optical lattices to superconducting-qubit devices.

    Updated: 2020-01-17
Contents have been reproduced by permission of the publishers.