Current journal: Machine Learning
  • CPAS: the UK’s national machine learning-based hospital capacity planning system for COVID-19
    Mach. Learn. (IF 2.672) Pub Date : 2020-11-24
    Zhaozhi Qian, Ahmed M. Alaa, Mihaela van der Schaar

    The coronavirus disease 2019 (COVID-19) global pandemic poses the threat of overwhelming healthcare systems with unprecedented demands for intensive care resources. Managing these demands cannot be effectively conducted without a nationwide collective effort that relies on data to forecast hospital demands on the national, regional, hospital and individual levels. To this end, we developed the COVID-19

  • Correction to: Fast and accurate pseudoinverse with sparse matrix reordering and incremental approach
    Mach. Learn. (IF 2.672) Pub Date : 2020-11-23
    Jinhong Jung, Lee Sael

    The following acknowledgments were inadvertently left out of the published article.

  • Conditional variance penalties and domain shift robustness
    Mach. Learn. (IF 2.672) Pub Date : 2020-11-23
    Christina Heinze-Deml, Nicolai Meinshausen

    When training a deep neural network for image classification, one can broadly distinguish between two types of latent features of images that will drive the classification. We can divide latent features into (i) ‘core’ or ‘conditionally invariant’ features \(C\) whose distribution \(C\vert Y\), conditional on the class \(Y\), does not change substantially across domains and (ii) ‘style’ features \(S\)

  • LoRAS: an oversampling approach for imbalanced datasets
    Mach. Learn. (IF 2.672) Pub Date : 2020-11-12
    Saptarshi Bej, Narek Davtyan, Markus Wolfien, Mariam Nassar, Olaf Wolkenhauer

    The Synthetic Minority Oversampling TEchnique (SMOTE) is widely used for the analysis of imbalanced datasets. It is known that SMOTE frequently over-generalizes the minority class, leading to misclassifications for the majority class and affecting the overall balance of the model. In this article, we present an approach that overcomes this limitation of SMOTE, employing Localized Random Affine Shadowsampling

  • Imputation of clinical covariates in time series
    Mach. Learn. (IF 2.672) Pub Date : 2020-11-10
    Dimitris Bertsimas, Agni Orfanoudaki, Colin Pawlowski

    Missing data is a common problem in longitudinal datasets which include multiple instances of the same individual observed at different points in time. We introduce a new approach, MedImpute, for imputing missing clinical covariates in multivariate panel data. This approach integrates patient specific information into an optimization formulation that can be adjusted for different imputation algorithms

  • Probabilistic inductive constraint logic
    Mach. Learn. (IF 2.672) Pub Date : 2020-11-10
    Fabrizio Riguzzi, Elena Bellodi, Riccardo Zese, Marco Alberti, Evelina Lamma

    Probabilistic logical models deal effectively with uncertain relations and entities typical of many real world domains. In the field of probabilistic logic programming usually the aim is to learn these kinds of models to predict specific atoms or predicates of the domain, called target atoms/predicates. However, it might also be useful to learn classifiers for interpretations as a whole: to this end

  • Binary classification with ambiguous training data
    Mach. Learn. (IF 2.672) Pub Date : 2020-11-03
    Naoya Otani, Yosuke Otsubo, Tetsuya Koike, Masashi Sugiyama

    In supervised learning, we often face ambiguous (A) samples that are difficult to label even by domain experts. In this paper, we consider a binary classification problem in the presence of such A samples. This problem is substantially different from semi-supervised learning since unlabeled samples are not necessarily difficult samples. Also, it is different from 3-class classification with the

  • Robust high dimensional expectation maximization algorithm via trimmed hard thresholding
    Mach. Learn. (IF 2.672) Pub Date : 2020-11-02
    Di Wang, Xiangyu Guo, Shi Li, Jinhui Xu

    In this paper, we study the problem of estimating latent variable models with arbitrarily corrupted samples in high dimensional space (i.e., \(d\gg n\)) where the underlying parameter is assumed to be sparse. Specifically, we propose a method called Trimmed (Gradient) Expectation Maximization which adds a trimming gradients step and a hard thresholding step to the Expectation step (E-step) and the

  • Spanning attack: reinforce black-box attacks with unlabeled data
    Mach. Learn. (IF 2.672) Pub Date : 2020-10-29
    Lu Wang, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, Yuan Jiang

    Adversarial black-box attacks aim to craft adversarial perturbations by querying input–output pairs of machine learning models. They are widely used to evaluate the robustness of pre-trained models. However, black-box attacks often suffer from the issue of query inefficiency due to the high dimensionality of the input space, and therefore incur a false sense of model robustness. In this paper, we relax

  • Incremental predictive clustering trees for online semi-supervised multi-target regression
    Mach. Learn. (IF 2.672) Pub Date : 2020-10-28
    Aljaž Osojnik, Panče Panov, Sašo Džeroski

    In many application settings, labeling data examples is a costly endeavor, while unlabeled examples are abundant and cheap to produce. Labeling examples can be particularly problematic in an online setting, where there can be arbitrarily many examples that arrive at high frequencies. It is also problematic when we need to predict complex values (e.g., multiple real values), a task that has started

  • Learning with mitigating random consistency from the accuracy measure
    Mach. Learn. (IF 2.672) Pub Date : 2020-10-27
    Jieting Wang, Yuhua Qian, Feijiang Li

    Human beings may make random guesses in decision-making. Occasionally, their guesses may generate consistency with the real situation. This kind of consistency is termed random consistency. In the area of machine learning, the randomness is unavoidable and ubiquitous in learning algorithms. However, the accuracy (A), which is a fundamental performance measure for machine learning, does not recognize

  • Boost image captioning with knowledge reasoning
    Mach. Learn. (IF 2.672) Pub Date : 2020-10-27
    Feicheng Huang, Zhixin Li, Haiyang Wei, Canlong Zhang, Huifang Ma

    Automatically generating a human-like description for a given image is a promising research direction in artificial intelligence, which has attracted a great deal of attention recently. Most existing attention methods explore the mapping relationships between words in a sentence and regions in an image; such an unpredictable matching manner sometimes causes inharmonious alignments that may reduce the quality of generated

  • Fast and accurate pseudoinverse with sparse matrix reordering and incremental approach
    Mach. Learn. (IF 2.672) Pub Date : 2020-10-27
    Jinhong Jung, Lee Sael

    How can we compute the pseudoinverse of a sparse feature matrix efficiently and accurately for solving optimization problems? A pseudoinverse is a generalization of a matrix inverse, which has been extensively utilized as a fundamental building block for solving linear systems in machine learning. However, an approximate computation, let alone an exact computation, of pseudoinverse is very time-consuming

  • Multi-label feature ranking with ensemble methods
    Mach. Learn. (IF 2.672) Pub Date : 2020-10-13
    Matej Petković, Sašo Džeroski, Dragi Kocev

    In this paper, we propose three ensemble-based feature ranking scores for multi-label classification (MLC), which is a generalisation of multi-class classification where the classes are not mutually exclusive. Each of the scores (Symbolic, Genie3 and Random forest) can be computed from three different ensembles of predictive clustering trees: Bagging, Random forest and Extra trees. We extensively evaluate

  • Evaluating time series forecasting models: an empirical study on performance estimation methods
    Mach. Learn. (IF 2.672) Pub Date : 2020-10-13
    Vitor Cerqueira, Luis Torgo, Igor Mozetič

    Performance estimation aims at estimating the loss that a predictive model will incur on unseen data. This process is a fundamental stage in any machine learning project. In this paper we study the application of these methods to time series forecasting tasks. For independent and identically distributed data the most common approach is cross-validation. However, the dependency among observations in

  • Fast greedy \(\mathcal {C}\)-bound minimization with guarantees
    Mach. Learn. (IF 2.672) Pub Date : 2020-09-23
    Baptiste Bauvin, Cécile Capponi, Jean-Francis Roy, François Laviolette

    The \(\mathcal {C}\)-bound is a tight bound on the true risk of a majority vote classifier that relies on the individual quality and pairwise disagreement of the voters and provides PAC-Bayesian generalization guarantees. Based on this bound, MinCq is a classification algorithm that returns a dense distribution on a finite set of voters by minimizing it. Introduced later and inspired by boosting, CqBoost

  • High-dimensional Bayesian optimization using low-dimensional feature spaces
    Mach. Learn. (IF 2.672) Pub Date : 2020-09-21
    Riccardo Moriconi, Marc Peter Deisenroth, K. S. Sesh Kumar

    Bayesian optimization (BO) is a powerful approach for seeking the global optimum of expensive black-box functions and has proven successful for fine tuning hyper-parameters of machine learning models. However, BO is practically limited to optimizing 10–20 parameters. To scale BO to high dimensions, we usually make structural assumptions on the decomposition of the objective and/or exploit the intrinsic

  • Combining Bayesian optimization and Lipschitz optimization
    Mach. Learn. (IF 2.672) Pub Date : 2020-09-16
    Mohamed Osama Ahmed; Sharan Vaswani; Mark Schmidt

    Bayesian optimization and Lipschitz optimization have developed alternative techniques for optimizing black-box functions. They each exploit a different form of prior about the function. In this work, we explore strategies to combine these techniques for better global optimization. In particular, we propose ways to use the Lipschitz continuity assumption within traditional BO algorithms, which we call

  • Statistical hierarchical clustering algorithm for outlier detection in evolving data streams
    Mach. Learn. (IF 2.672) Pub Date : 2020-09-04
    Dalibor Krleža, Boris Vrdoljak, Mario Brčić

    Anomaly detection is a hard data analysis process that requires constant creation and improvement of data analysis algorithms. Using traditional clustering algorithms to analyse data streams is impossible due to processing power and memory issues. To solve this, the traditional clustering algorithm complexity needed to be reduced, which led to the creation of sequential clustering algorithms. The usual

  • Imbalanced regression and extreme value prediction
    Mach. Learn. (IF 2.672) Pub Date : 2020-09-04
    Rita P. Ribeiro, Nuno Moniz

    Research in imbalanced domain learning has almost exclusively focused on solving classification tasks for accurate prediction of cases labelled with a rare class. Approaches for addressing such problems in regression tasks are still scarce due to two main factors. First, standard regression tasks assume each domain value as equally important. Second, standard evaluation metrics focus on assessing the

  • Ada-boundary: accelerating DNN training via adaptive boundary batch selection
    Mach. Learn. (IF 2.672) Pub Date : 2020-09-04
    Hwanjun Song, Sundong Kim, Minseok Kim, Jae-Gil Lee

    Neural networks converge faster with help from a smart batch selection strategy. In this regard, we propose Ada-Boundary, a novel and simple adaptive batch selection algorithm that constructs an effective mini-batch according to the learning progress of the model. Our key idea is to exploit confusing samples for which the model cannot predict labels with high confidence. Thus, samples near the current

  • Skew Gaussian processes for classification
    Mach. Learn. (IF 2.672) Pub Date : 2020-09-04
    Alessio Benavoli, Dario Azzimonti, Dario Piga

    Gaussian processes (GPs) are distributions over functions, which provide a Bayesian nonparametric approach to regression and classification. In spite of their success, GPs have limited use in some applications, for example, in some cases a symmetric distribution with respect to its mean is an unreasonable model. This implies, for instance, that the mean and the median coincide, while the mean and median

  • A decision-theoretic approach for model interpretability in Bayesian framework
    Mach. Learn. (IF 2.672) Pub Date : 2020-09-04
    Homayun Afrabandpey, Tomi Peltola, Juho Piironen, Aki Vehtari, Samuel Kaski

    A salient approach to interpretable machine learning is to restrict modeling to simple models. In the Bayesian framework, this can be pursued by restricting the model structure and prior to favor interpretable models. Fundamentally, however, interpretability is about users’ preferences, not the data generation mechanism; it is more natural to formulate interpretability as a utility function. In this

  • Weak approximation of transformed stochastic gradient MCMC
    Mach. Learn. (IF 2.672) Pub Date : 2020-09-04
    Soma Yokoi, Takuma Otsuka, Issei Sato

    Stochastic gradient Langevin dynamics (SGLD) is a computationally efficient sampler for Bayesian posterior inference given a large scale dataset and a complex model. Although SGLD is designed for unbounded random variables, practical models often incorporate variables within a bounded domain, such as non-negative or a finite interval. The use of variable transformation is a typical way to handle such

  • Co-eye: a multi-resolution ensemble classifier for symbolically approximated time series
    Mach. Learn. (IF 2.672) Pub Date : 2020-08-26
    Zahraa S. Abdallah, Mohamed Medhat Gaber

    Time series classification (TSC) is a challenging task that has attracted many researchers in the last few years. One main challenge in TSC is the diversity of domains where time series data come from. Thus, there is no “one model that fits all” in TSC. Some algorithms are very accurate in classifying a specific type of time series when the whole series is considered, while some only target the existence/non-existence

  • Bonsai: diverse and shallow trees for extreme multi-label classification
    Mach. Learn. (IF 2.672) Pub Date : 2020-08-23
    Sujay Khandagale, Han Xiao, Rohit Babbar

    Extreme multi-label classification (XMC) refers to supervised multi-label learning involving hundreds of thousands or even millions of labels. In this paper, we develop a suite of algorithms, called Bonsai, which generalizes the notion of label representation in XMC, and partitions the labels in the representation space to learn shallow trees. We show three concrete realizations of this label representation

  • Ensembles of extremely randomized predictive clustering trees for predicting structured outputs
    Mach. Learn. (IF 2.672) Pub Date : 2020-08-17
    Dragi Kocev, Michelangelo Ceci, Tomaž Stepišnik

    We address the task of learning ensembles of predictive models for structured output prediction (SOP). We focus on three SOP tasks: multi-target regression (MTR), multi-label classification (MLC) and hierarchical multi-label classification (HMC). In contrast to standard classification and regression, where the output is a single (discrete or continuous) variable, in SOP the output is a data structure—a

  • Interpretable clustering: an optimization approach
    Mach. Learn. (IF 2.672) Pub Date : 2020-08-16
    Dimitris Bertsimas, Agni Orfanoudaki, Holly Wiberg

    State-of-the-art clustering algorithms provide little insight into the rationale for cluster membership, limiting their interpretability. In complex real-world applications, the latter poses a barrier to machine learning adoption when experts are asked to provide detailed explanations of their algorithms’ recommendations. We present a new unsupervised learning method that leverages Mixed Integer Optimization

  • Learning representations from dendrograms
    Mach. Learn. (IF 2.672) Pub Date : 2020-08-16
    Morteza Haghir Chehreghani, Mostafa Haghir Chehreghani

    We propose unsupervised representation learning and feature extraction from dendrograms. The commonly used Minimax distance measures correspond to building a dendrogram with single linkage criterion, with defining specific forms of a level function and a distance function over that. Therefore, we extend this method to arbitrary dendrograms. We develop a generalized framework wherein different distance

  • Using error decay prediction to overcome practical issues of deep active learning for named entity recognition
    Mach. Learn. (IF 2.672) Pub Date : 2020-08-05
    Haw-Shiuan Chang, Shankar Vembu, Sunil Mohan, Rheeya Uppaal, Andrew McCallum

    Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks. However, they exhibit several weaknesses in practice, including (a) inability to use uncertainty sampling with black-box models, (b) lack of robustness to labeling noise, and (c) lack of transparency. In response, we propose a transparent batch active sampling framework by estimating

  • Predicting rice phenotypes with meta and multi-target learning
    Mach. Learn. (IF 2.672) Pub Date : 2020-08-02
    Oghenejokpeme I. Orhobor, Nickolai N. Alexandrov, Ross D. King

    The features in some machine learning datasets can naturally be divided into groups. This is the case with genomic data, where features can be grouped by chromosome. In many applications it is common for these groupings to be ignored, as interactions may exist between features belonging to different groups. However, including a group that does not influence a response introduces noise when fitting

  • Node classification over bipartite graphs through projection
    Mach. Learn. (IF 2.672) Pub Date : 2020-07-28
    Marija Stankova, Stiene Praet, David Martens, Foster Provost

    Many real-world large datasets correspond to bipartite graph data settings—think for example of users rating movies or people visiting locations. Although there has been some prior work on data analysis with such bigraphs, no general network-oriented methodology has been proposed yet to perform node classification. In this paper we propose a three-stage classification framework that effectively deals

  • Unsupervised representation learning with Minimax distance measures
    Mach. Learn. (IF 2.672) Pub Date : 2020-07-28
    Morteza Haghir Chehreghani

    We investigate the use of Minimax distances to extract in a nonparametric way the features that capture the unknown underlying patterns and structures in the data. We develop a general-purpose and computationally efficient framework to employ Minimax distances with many machine learning methods that perform on numerical data. We study both computing the pairwise Minimax distances for all pairs of objects

  • Embedding-based Silhouette community detection
    Mach. Learn. (IF 2.672) Pub Date : 2020-07-27
    Blaž Škrlj, Jan Kralj, Nada Lavrač

    Mining complex data in the form of networks is of increasing interest in many scientific disciplines. Network communities correspond to densely connected subnetworks, and often represent key functional parts of real-world systems. This paper proposes the embedding-based Silhouette community detection (SCD), an approach for detecting communities, based on clustering of network node embeddings, i.e.

  • The voice of optimization
    Mach. Learn. (IF 2.672) Pub Date : 2020-07-19
    Dimitris Bertsimas, Bartolomeo Stellato

    We introduce the idea that using optimal classification trees (OCTs) and optimal classification trees with-hyperplanes (OCT-Hs), interpretable machine learning algorithms developed by Bertsimas and Dunn (Mach Learn 106(7):1039–1082, 2017), we are able to obtain insight on the strategy behind the optimal solution in continuous and mixed-integer convex optimization problem as a function of key parameters

  • Reflections on reciprocity in research
    Mach. Learn. (IF 2.672) Pub Date : 2020-07-16
    Peter A Flach

  • Double random forest
    Mach. Learn. (IF 2.672) Pub Date : 2020-07-02
    Sunwoo Han; Hyunjoong Kim; Yung-Seop Lee

    Random forest (RF) is one of the most popular parallel ensemble methods, using decision trees as classifiers. One of the hyper-parameters to choose from for RF fitting is the nodesize, which determines the individual tree size. In this paper, we begin with the observation that for many data sets (34 out of 58), the best RF prediction accuracy is achieved when the trees are grown fully by minimizing

  • Propositionalization and embeddings: two sides of the same coin
    Mach. Learn. (IF 2.672) Pub Date : 2020-06-28
    Nada Lavrač, Blaž Škrlj, Marko Robnik-Šikonja

    Data preprocessing is an important component of machine learning pipelines, which requires ample time and resources. An integral part of preprocessing is data transformation into the format required by a given learning algorithm. This paper outlines some of the modern data processing techniques used in relational learning that enable data fusion from different input data types and formats into a single

  • An empirical analysis of binary transformation strategies and base algorithms for multi-label learning
    Mach. Learn. (IF 2.672) Pub Date : 2020-06-10
    Adriano Rivolli; Jesse Read; Carlos Soares; Bernhard Pfahringer; André C. P. L. F. de Carvalho

    Investigating strategies that are able to efficiently deal with multi-label classification tasks is a current research topic in machine learning. Many methods have been proposed, making the selection of the most suitable strategy a challenging issue. From this premise, this paper presents an extensive empirical analysis of the binary transformation strategies and base algorithms for multi-label learning

  • Correction to: Efficient feature selection using shrinkage estimators
    Mach. Learn. (IF 2.672) Pub Date : 2020-06-04
    Konstantinos Sechidis, Laura Azzimonti, Adam Pocock, Giorgio Corani, James Weatherall, Gavin Brown

    There was a mistake in the proof of the optimal shrinkage intensity for our estimator presented in Section 3.1.

  • Correction to: Robust classification via MOM minimization
    Mach. Learn. (IF 2.672) Pub Date : 2020-06-03
    Guillaume Lecué, Matthieu Lerasle, Timothée Mathieu

    There is a mistake in one of the authors’ names (in both online and print versions of the article): it should be Timothée Mathieu instead of Timlothée Mathieu.

  • Anomaly detection with inexact labels
    Mach. Learn. (IF 2.672) Pub Date : 2020-05-31
    Tomoharu Iwata; Machiko Toyoda; Shotaro Tora; Naonori Ueda

    We propose a supervised anomaly detection method for data with inexact anomaly labels, where each label, which is assigned to a set of instances, indicates that at least one instance in the set is anomalous. Although many anomaly detection methods have been proposed, they cannot handle inexact anomaly labels. To measure the performance with inexact anomaly labels, we define the inexact AUC, which is

  • Transfer learning by mapping and revising boosted relational dependency networks
    Mach. Learn. (IF 2.672) Pub Date : 2020-05-11
    Rodrigo Azevedo Santos; Aline Paes; Gerson Zaverucha

    Statistical machine learning algorithms usually assume the availability of data of considerable size to train the models. However, they would fail in addressing domains where data is difficult or expensive to obtain. Transfer learning has emerged to address this problem of learning from scarce data by relying on a model learned in a source domain where data is easy to obtain to be a starting point

  • Robust classification via MOM minimization
    Mach. Learn. (IF 2.672) Pub Date : 2020-04-27
    Guillaume Lecué; Matthieu Lerasle; Timothée Mathieu

    We present an extension of Chervonenkis and Vapnik’s classical empirical risk minimization (ERM) where the empirical risk is replaced by a median-of-means (MOM) estimator of the risk. The resulting new estimators are called MOM minimizers. While ERM is sensitive to corruption of the dataset for many classical loss functions used in classification, we show that MOM minimizers behave well in theory,

  • Engineering problems in machine learning systems
    Mach. Learn. (IF 2.672) Pub Date : 2020-04-23
    Hiroshi Kuwajima; Hirotoshi Yasuoka; Toshihiro Nakae

    Fatal accidents are a major issue hindering the wide acceptance of safety-critical systems that employ machine learning and deep learning models, such as automated driving vehicles. In order to use machine learning in a safety-critical system, it is necessary to demonstrate the safety and security of the system through engineering processes. However, thus far, no such widely accepted engineering concepts

  • Learning from positive and unlabeled data: a survey
    Mach. Learn. (IF 2.672) Pub Date : 2020-04-02
    Jessa Bekker; Jesse Davis

    Learning from positive and unlabeled data or PU learning is the setting where a learner only has access to positive examples and unlabeled data. The assumption is that the unlabeled data can contain both positive and negative examples. This setting has attracted increasing interest within the machine learning literature as this type of data naturally arises in applications such as medical diagnosis

  • Classification using proximity catch digraphs
    Mach. Learn. (IF 2.672) Pub Date : 2020-03-31
    Artür Manukyan; Elvan Ceyhan

    We employ random geometric digraphs to construct semi-parametric classifiers. These data-random digraphs belong to parameterized random digraph families called proximity catch digraphs (PCDs). A related geometric digraph family, class cover catch digraph (CCCD), has been used to solve the class cover problem by using its approximate minimum dominating set and showed relatively good performance in the

  • Discovering subjectively interesting multigraph patterns
    Mach. Learn. (IF 2.672) Pub Date : 2020-03-16
    Sarang Kapoor; Dhish Kumar Saxena; Matthijs van Leeuwen

    Over the past decade, network analysis has attracted substantial interest because of its potential to solve many real-world problems. This paper lays the conceptual foundation for an application in aviation, through focusing on the discovery of patterns in multigraphs (graphs in which multiple edges can be present between vertices). Our main contributions are twofold. Firstly, we propose a novel subjective

  • Detecting anomalous packets in network transfers: investigations using PCA, autoencoder and isolation forest in TCP
    Mach. Learn. (IF 2.672) Pub Date : 2020-03-12
    Mariam Kiran; Cong Wang; George Papadimitriou; Anirban Mandal; Ewa Deelman

    Large-scale scientific workflows rely heavily on high-performance file transfers. These transfers require strict quality parameters such as guaranteed bandwidth, no packet loss or data duplication. To have successful file transfers, methods such as predetermined thresholds and statistical analysis need to be done to determine abnormal patterns. Network administrators routinely monitor and analyze network

  • Classification with costly features as a sequential decision-making problem
    Mach. Learn. (IF 2.672) Pub Date : 2020-02-28
    Jaromír Janisch; Tomáš Pevný; Viliam Lisý

    This work focuses on a specific classification problem, where the information about a sample is not readily available, but has to be acquired for a cost, and there is a per-sample budget. Inspired by real-world use-cases, we analyze average and hard variations of a directly specified budget. We postulate the problem in its explicit formulation and then convert it into an equivalent MDP, that can be

  • Joint maximization of accuracy and information for learning the structure of a Bayesian network classifier
    Mach. Learn. (IF 2.672) Pub Date : 2020-02-28
    Dan Halbersberg; Maydan Wienreb; Boaz Lerner

    Although recent studies have shown that a Bayesian network classifier (BNC) that maximizes the classification accuracy (i.e., minimizes the 0/1 loss function) is a powerful tool in both knowledge representation and classification, this classifier: (1) focuses on the majority class and, therefore, misclassifies minority classes; (2) is usually uninformative about the distribution of misclassifications;

  • Scalable Bayesian preference learning for crowds
    Mach. Learn. (IF 2.672) Pub Date : 2020-02-06
    Edwin Simpson; Iryna Gurevych

    We propose a scalable Bayesian preference learning method for jointly predicting the preferences of individuals as well as the consensus of a crowd from pairwise labels. People’s opinions often differ greatly, making it difficult to predict their preferences from small amounts of personal data. Individual biases also make it harder to infer the consensus of a crowd when there are few labels per item

  • Sparse hierarchical regression with polynomials
    Mach. Learn. (IF 2.672) Pub Date : 2020-01-24
    Dimitris Bertsimas; Bart Van Parys

    We present a novel method for sparse polynomial regression. We are interested in that degree r polynomial which depends on at most k inputs, counting at most \(\ell\) monomial terms, and minimizes the sum of the squares of its prediction errors. Such highly structured sparse regression was denoted by Bach (Advances in neural information processing systems, pp 105–112, 2009) as sparse hierarchical regression

  • Improving coordination in small-scale multi-agent deep reinforcement learning through memory-driven communication
    Mach. Learn. (IF 2.672) Pub Date : 2020-01-23
    Emanuele Pesce, Giovanni Montana

    Deep reinforcement learning algorithms have recently been used to train multiple interacting agents in a centralised manner whilst keeping their execution decentralised. When the agents can only acquire partial observations and are faced with tasks requiring coordination and synchronisation skills, inter-agent communication plays an essential role. In this work, we propose a framework for multi-agent

  • Conditional density estimation and simulation through optimal transport
    Mach. Learn. (IF 2.672) Pub Date : 2020-01-13
    Esteban G. Tabak; Giulio Trigila; Wenjun Zhao

    A methodology to estimate from samples the probability density of a random variable x conditional to the values of a set of covariates \(\{z_{l}\}\) is proposed. The methodology relies on a data-driven formulation of the Wasserstein barycenter, posed as a minimax problem in terms of the conditional map carrying each sample point to the barycenter and a potential characterizing the inverse of this map

  • High-dimensional model recovery from random sketched data by exploring intrinsic sparsity
    Mach. Learn. (IF 2.672) Pub Date : 2020-01-07
    Tianbao Yang; Lijun Zhang; Qihang Lin; Shenghuo Zhu; Rong Jin

    Learning from large-scale and high-dimensional data still remains a computationally challenging problem, though it has received increasing interest recently. To address this issue, randomized reduction methods have been developed by either reducing the dimensionality or reducing the number of training instances to obtain a small sketch of the original data. In this paper, we focus on recovering a high-dimensional

  • Model-based kernel sum rule: kernel Bayesian inference with probabilistic models
    Mach. Learn. (IF 2.672) Pub Date : 2020-01-02
    Yu Nishiyama; Motonobu Kanagawa; Arthur Gretton; Kenji Fukumizu

    Kernel Bayesian inference is a principled approach to nonparametric inference in probabilistic graphical models, where probabilistic relationships between variables are learned from data in a nonparametric manner. Various algorithms of kernel Bayesian inference have been developed by combining kernelized basic probabilistic operations such as the kernel sum rule and kernel Bayes’ rule. However, the

  • Improved graph-based SFA: information preservation complements the slowness principle
    Mach. Learn. (IF 2.672) Pub Date : 2019-12-26
    Alberto N. Escalante-B.; Laurenz Wiskott

    Slow feature analysis (SFA) is an unsupervised learning algorithm that extracts slowly varying features from a multi-dimensional time series. SFA has been extended to supervised learning (classification and regression) by an algorithm called graph-based SFA (GSFA). GSFA relies on a particular graph structure to extract features that preserve label similarities. Processing of high dimensional input

  • On cognitive preferences and the plausibility of rule-based models
    Mach. Learn. (IF 2.672) Pub Date : 2019-12-24
    Johannes Fürnkranz; Tomáš Kliegr; Heiko Paulheim

    It is conventional wisdom in machine learning and data mining that logical models such as rule sets are more interpretable than other models, and that among such rule-based models, simpler models are more interpretable than more complex ones. In this position paper, we question this latter assumption by focusing on one particular aspect of interpretability, namely the plausibility of models. Roughly

  • Distributed block-diagonal approximation methods for regularized empirical risk minimization
    Mach. Learn. (IF 2.672) Pub Date : 2019-12-18
    Ching-pei Lee; Kai-Wei Chang

    In recent years, there has been a growing need to train machine learning models on a huge volume of data. Therefore, designing efficient distributed optimization algorithms for empirical risk minimization (ERM) has become an active and challenging research topic. In this paper, we propose a flexible framework for distributed ERM training through solving the dual problem, which provides a unified description

Contents have been reproduced by permission of the publishers.