Current journal: Advances in Data Analysis and Classification
  • Robust archetypoids for anomaly detection in big functional data
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-08-03
    Guillermo Vinue, Irene Epifanio

    Archetypoid analysis (ADA) has proven to be a successful unsupervised statistical technique to identify extreme observations in the periphery of the data cloud, both in classical multivariate data and functional data. However, two questions remain open in this field: the use of ADA for outlier detection and its scalability. We propose to use robust functional archetypoids and adjusted boxplot to pinpoint

    Updated: 2020-08-04
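The entry above pairs robust functional archetypoids with the adjusted boxplot to flag anomalous curves. The sketch below shows only the adjusted-boxplot flagging step (the skewness-adjusted fences of Hubert and Vandervieren), applied to a generic vector of outlyingness scores that are assumed to be precomputed, e.g. distances of each curve to its nearest archetypoid; it is not the authors' ADA pipeline.

```python
import numpy as np
from statsmodels.stats.stattools import medcouple

def adjusted_boxplot_outliers(scores):
    """Flag values outside the skewness-adjusted boxplot fences."""
    q1, q3 = np.percentile(scores, [25, 75])
    iqr = q3 - q1
    mc = medcouple(scores)  # robust skewness measure in [-1, 1]
    if mc >= 0:
        lo = q1 - 1.5 * np.exp(-4.0 * mc) * iqr
        hi = q3 + 1.5 * np.exp(3.0 * mc) * iqr
    else:
        lo = q1 - 1.5 * np.exp(-3.0 * mc) * iqr
        hi = q3 + 1.5 * np.exp(4.0 * mc) * iqr
    return (scores < lo) | (scores > hi)

rng = np.random.default_rng(0)
scores = np.concatenate([rng.gamma(2.0, 1.0, 500), [15.0, 18.0]])  # skewed scores plus two outliers
print(adjusted_boxplot_outliers(scores).sum(), "observations flagged")
```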
  • Adaptive sparse group LASSO in quantile regression
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-07-29
    Alvaro Mendez-Civieta, M. Carmen Aguilera-Morillo, Rosa E. Lillo

    This paper studies the introduction of sparse group LASSO (SGL) to the quantile regression framework. Additionally, a more flexible version, an adaptive SGL, is proposed based on the adaptive idea, that is, the use of adaptive weights in the penalization. Adaptive estimators are usually focused on the study of the oracle property under asymptotic and double asymptotic frameworks. A key step on the

    Updated: 2020-07-30
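As a rough stand-in for the adaptive sparse group LASSO quantile regression described in the entry above, the sketch below fits a plain L1-penalized median regression with scikit-learn's QuantileRegressor on synthetic heavy-tailed data; it carries neither the group structure nor the adaptive weights of the paper, and only illustrates the penalized quantile-loss idea.

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = X[:, 0] - 2 * X[:, 1] + rng.standard_t(df=3, size=200)   # heavy-tailed noise

# Fit the conditional median with an L1 penalty; irrelevant coefficients shrink towards zero.
model = QuantileRegressor(quantile=0.5, alpha=0.1, solver="highs").fit(X, y)
print(np.round(model.coef_, 2))
```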
  • Hierarchical conceptual clustering based on quantile method for identifying microscopic details in distributional data
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-07-22
    Kadri Umbleja, Manabu Ichino, Hiroyuki Yaguchi

    Symbolic data is aggregated from bigger traditional datasets in order to hide entry-specific details and to enable analysing large amounts of data, like big data, which would otherwise not be possible. Symbolic data may appear in many different but complex forms like intervals and histograms. Identifying patterns and finding similarities between objects is one of the most fundamental tasks of data

    Updated: 2020-07-22
  • On the use of quantile regression to deal with heterogeneity: the case of multi-block data
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-07-19
    Cristina Davino, Rosaria Romano, Domenico Vistocco

    The aim of the paper is to propose a quantile regression based strategy to assess heterogeneity in a multi-block type data structure. Specifically, the paper deals with a particular data structure where several blocks of variables are observed on the same units and a structure of relations is assumed between the different blocks. The idea is that quantile regression complements the results of the least

    Updated: 2020-07-20
  • A bias-variance analysis of state-of-the-art random forest text classifiers
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-07-19
    Thiago Salles, Leonardo Rocha, Marcos Gonçalves

    Random forest (RF) classifiers excel in a variety of automatic classification tasks, such as topic categorization and sentiment analysis. Despite such advantages, RF models have been shown to perform poorly when facing noisy data, commonly found in textual data, for instance. Some RF variants have been proposed to provide better generalization capabilities under such challenging scenarios, including

    Updated: 2020-07-20
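Not the paper's bias-variance decomposition, just a minimal reproduction of the setup it studies: TF-IDF features feeding a random forest text classifier, scored by cross-validation on a toy sentiment corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

docs = ["great movie, loved it", "terrible plot and acting", "excellent and fun",
        "boring, fell asleep", "fantastic performance", "awful, waste of time"] * 10
labels = [1, 0, 1, 0, 1, 0] * 10

# TF-IDF representation of the documents feeding a random forest classifier.
clf = make_pipeline(TfidfVectorizer(), RandomForestClassifier(n_estimators=200, random_state=0))
print("mean CV accuracy:", cross_val_score(clf, docs, labels, cv=5).mean())
```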
  • Active learning of constraints for weighted feature selection
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-07-10
    Samah Hijazi, Denis Hamad, Mariam Kalakech, Ali Kalakech

    Pairwise constraints, a cheaper kind of supervision information that does not need to reveal the class labels of data points, were initially suggested to enhance the performance of clustering algorithms. Recently, researchers have become interested in using them for feature selection. However, in most current methods, pairwise constraints are provided passively and generated randomly over multiple algorithmic

    Updated: 2020-07-10
  • A stochastic block model for interaction lengths
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-06-18
    Riccardo Rastelli, Michael Fop

    We propose a new stochastic block model that focuses on the analysis of interaction lengths in dynamic networks. The model does not rely on a discretization of the time dimension and may be used to analyze networks that evolve continuously over time. The framework relies on a clustering structure on the nodes, whereby two nodes belonging to the same latent group tend to create interactions and non-interactions

    Updated: 2020-06-18
  • Regime dependent interconnectedness among fuzzy clusters of financial time series
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-06-16
    Giovanni De Luca, Paola Zuccolotto

    We analyze the dynamic structure of lower tail dependence coefficients within groups of assets defined such that assets belonging to the same group are characterized by pairwise high associations between extremely low values. The groups are identified by means of a fuzzy cluster analysis algorithm. The tail dependence coefficients are estimated using the Joe–Clayton copula function, and the 75th percentile

    Updated: 2020-06-16
  • M-estimators and trimmed means: from Hilbert-valued to fuzzy set-valued data
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-06-12
    Beatriz Sinova, Stefan Van Aelst, Pedro Terán

    Different approaches to robustly measure the location of data associated with a random experiment have been proposed in the literature, with the aim of avoiding the high sensitivity to outliers or data changes typical for the mean. In particular, M-estimators and trimmed means have been studied in general spaces, and can be used to handle Hilbert-valued data. Both alternatives are of interest due to

    Updated: 2020-06-12
  • ParticleMDI: particle Monte Carlo methods for the cluster analysis of multiple datasets with applications to cancer subtype identification
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-06-12
    Nathan Cunningham, Jim E. Griffin, David L. Wild

    We present a novel nonparametric Bayesian approach for performing cluster analysis in a context where observational units have data arising from multiple sources. Our approach uses a particle Gibbs sampler for inference in which cluster allocations are jointly updated using a conditional particle filter within a Gibbs sampler, improving the mixing of the MCMC chain. We develop several approaches to

    Updated: 2020-06-12
  • Isotonic boosting classification rules
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-06-12
    David Conde, Miguel A. Fernández, Cristina Rueda, Bonifacio Salvador

    In many real classification problems a monotone relation between some predictors and the classes may be assumed when higher (or lower) values of those predictors are related to higher levels of the response. In this paper, we propose new boosting algorithms, based on LogitBoost, that incorporate this isotonicity information, yielding more accurate and easily interpretable rules. These algorithms are

    Updated: 2020-06-12
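The entry above builds LogitBoost variants that respect a known monotone relation between predictors and classes. As a hedged illustration of the same isotonicity idea with a different learner, scikit-learn's histogram gradient boosting accepts per-feature monotonicity constraints.

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
# The class probability increases with feature 0 and decreases with feature 1.
p = 1 / (1 + np.exp(-(2 * X[:, 0] - 2 * X[:, 1])))
y = rng.binomial(1, p)

# monotonic_cst: +1 forces an increasing effect, -1 a decreasing one, 0 leaves it unconstrained.
clf = HistGradientBoostingClassifier(monotonic_cst=[1, -1, 0], random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```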
  • Chained correlations for feature selection
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-06-09
    Ludwig Lausser, Robin Szekely, Hans A. Kestler

    Data-driven algorithms stand and fall with the availability and quality of existing data sources. Both can be limited in high-dimensional settings (n ≫ m). For example, supervised learning algorithms designed for molecular pheno- or genotyping are restricted to samples of the corresponding diagnostic classes. Samples of other related entities, such as arise in differential diagnosis, are usually

    Updated: 2020-06-09
  • The ultrametric correlation matrix for modelling hierarchical latent concepts
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-05-28
    Carlo Cavicchia, Maurizio Vichi, Giorgia Zaccaria

    Many relevant multidimensional phenomena are defined by nested latent concepts, which can be represented by a tree-structure supposing a hierarchical relationship among manifest variables. The root of the tree is a general concept which includes more specific ones. The aim of the paper is to reconstruct an observed data correlation matrix of manifest variables through an ultrametric correlation matrix

    Updated: 2020-05-28
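Loosely related to the entry above: one generic way to approximate a correlation matrix by an ultrametric one is to cluster the variables hierarchically on the distance 1 - r and read back the cophenetic (ultrametric) distances. The sketch below does exactly that with SciPy; it is a stand-in for intuition, not the authors' model.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

rng = np.random.default_rng(3)
latent = rng.normal(size=(300, 2))
# Six variables: three noisy copies of each of two latent concepts.
X = np.column_stack([latent[:, 0] + 0.3 * rng.normal(size=300) for _ in range(3)] +
                    [latent[:, 1] + 0.3 * rng.normal(size=300) for _ in range(3)])
R = np.corrcoef(X, rowvar=False)

d = squareform(1 - R, checks=False)        # condensed distances between variables
Z = linkage(d, method="average")           # hierarchical clustering of the variables
c, coph_d = cophenet(Z, d)                 # cophenetic correlation and ultrametric distances
ultrametric = 1 - squareform(coph_d)       # implied ultrametric "correlation" matrix
print("cophenetic correlation:", round(float(c), 3))
print(np.round(ultrametric, 2))
```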
  • Data generation for composite-based structural equation modeling methods
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-05-26
    Rainer Schlittgen, Marko Sarstedt, Christian M. Ringle

    Examining the efficacy of composite-based structural equation modeling (SEM) features prominently in research. However, studies analyzing the efficacy of corresponding estimators usually rely on factor model data. Thereby, they assess and analyze their performance on erroneous grounds (i.e., factor model data instead of composite model data). A potential reason for this malpractice lies in the lack

    Updated: 2020-05-26
  • Simultaneous dimension reduction and clustering via the NMF-EM algorithm
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-05-25
    Léna Carel, Pierre Alquier

    Mixture models are among the most popular tools for clustering. However, when the dimension and the number of clusters are large, the estimation of the clusters becomes challenging, as does their interpretation. Restrictions on the parameters can be used to reduce the dimension. An example is given by mixtures of factor analyzers (MFA) for Gaussian mixtures. The extension of MFA to non-Gaussian mixtures is

    Updated: 2020-05-25
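The entry above estimates dimension reduction and clustering jointly via an NMF-EM algorithm. The sketch below is only a loose two-step stand-in (NMF first, a Gaussian mixture on the scores after), shown on a synthetic non-negative count matrix.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
X = rng.poisson(lam=rng.gamma(2.0, 1.0, size=(300, 50)))     # non-negative count matrix

# Step 1: reduce to a few non-negative factors; step 2: cluster the factor scores.
W = NMF(n_components=5, init="nndsvda", max_iter=500, random_state=0).fit_transform(X)
labels = GaussianMixture(n_components=3, random_state=0).fit_predict(W)
print("cluster sizes:", np.bincount(labels))
```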
  • Mixtures of Dirichlet-Multinomial distributions for supervised and unsupervised classification of short text data
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-05-25
    Laura Anderlucci, Cinzia Viroli

    Topic detection in short textual data is a challenging task due to its representation as a high-dimensional and extremely sparse document-term matrix. In this paper we focus on the problem of classifying textual data on the basis of their (unique) topic. For unsupervised classification, a popular approach called Mixture of Unigrams consists in considering a mixture of multinomial distributions over the

    Updated: 2020-05-25
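The Mixture of Unigrams mentioned above is a mixture of multinomial distributions in which each document is generated by a single topic. Below is a compact, self-contained EM sketch for that plain multinomial mixture on a toy document-term matrix; it is not the authors' Dirichlet-Multinomial mixture and ignores the supervised case.

```python
import numpy as np

rng = np.random.default_rng(12)
V, K, n = 30, 2, 200
true_theta = rng.dirichlet(np.ones(V) * 0.2, size=K)            # word distribution per topic
z = rng.integers(0, K, n)                                        # one topic per document
X = np.vstack([rng.multinomial(40, true_theta[k]) for k in z])   # toy document-term matrix

pi = np.full(K, 1.0 / K)                     # mixing weights
theta = rng.dirichlet(np.ones(V), size=K)    # initial word probabilities
for _ in range(100):
    # E-step: posterior topic responsibilities, computed in the log domain for stability.
    log_r = np.log(pi) + X @ np.log(theta).T
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: update mixing weights and word probabilities (tiny smoothing avoids zeros).
    pi = r.mean(axis=0)
    theta = r.T @ X + 1e-6
    theta /= theta.sum(axis=1, keepdims=True)

print("recovered cluster sizes:", np.bincount(r.argmax(axis=1)))
```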
  • Clustering discrete-valued time series
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-05-20
    Tyler Roick, Dimitris Karlis, Paul D. McNicholas

    There is a need for the development of models that are able to account for discreteness in data, along with its time series properties and correlation. Our focus falls on INteger-valued AutoRegressive (INAR) type models. The INAR type models can be used in conjunction with existing model-based clustering techniques to cluster discrete-valued time series data. With the use of a finite mixture model

    Updated: 2020-05-20
  • Gaussian mixture modeling and model-based clustering under measurement inconsistency
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-05-12
    Shuchismita Sarkar, Volodymyr Melnykov, Rong Zheng

    Finite mixtures present a powerful tool for modeling complex heterogeneous data. One of their most important applications is model-based clustering. It assumes that each data group can be reasonably described by one mixture model component. This establishes a one-to-one relationship between mixture components and clusters. In some cases, however, this relationship can be broken due to the presence

    Updated: 2020-05-12
  • Semiparametric mixtures of regressions with single-index for model based clustering
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-04-23
    Sijia Xiang, Weixin Yao

    In this article, we propose two classes of semiparametric mixture regression models with single-index for model based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low dimensional predictors, the new semiparametric models can easily incorporate high dimensional predictors into the nonparametric components. The proposed models are very general

    Updated: 2020-04-23
  • Mixture modeling of data with multiple partial right-censoring levels
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-04-21
    Semhar Michael, Tatjana Miljkovic, Volodymyr Melnykov

    In this paper, a new flexible approach to modeling data with multiple partial right-censoring points is proposed. This method is based on finite mixture models, a flexible tool for modeling heterogeneity in data. A general framework to accommodate partial censoring is considered. In this setting, it is assumed that a certain portion of data points are censored and the rest are not. This situation occurs

    Updated: 2020-04-21
  • Kappa coefficients for dichotomous-nominal classifications
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-04-07
    Matthijs J. Warrens

    Two types of nominal classifications are distinguished, namely regular nominal classifications and dichotomous-nominal classifications. The first type does not include an ‘absence’ category (for example, no disorder), whereas the second type does include an ‘absence’ category. Cohen’s unweighted kappa can be used to quantify agreement between two regular nominal classifications with the same categories

    Updated: 2020-04-20
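A small companion example for the entry above: Cohen's unweighted kappa for two raters on a nominal scale that includes an 'absence' category ('none'). It only shows the standard kappa computation, not the paper's new coefficients for dichotomous-nominal classifications.

```python
from sklearn.metrics import cohen_kappa_score

# Two raters assigning nominal categories that include an 'absence' level ("none").
rater_a = ["none", "flu", "cold", "none", "cold", "flu", "none", "cold"]
rater_b = ["none", "flu", "flu",  "none", "cold", "flu", "cold", "cold"]
print("Cohen's kappa:", round(cohen_kappa_score(rater_a, rater_b), 3))
```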
  • A cost-sensitive constrained Lasso
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-03-12
    Rafael Blanquero, Emilio Carrizosa, Pepa Ramírez-Cobo, M. Remedios Sillero-Denamiel

    The Lasso has become a benchmark data analysis procedure, and numerous variants have been proposed in the literature. Although the Lasso formulations are stated so that the overall prediction error is optimized, they allow no full control over the prediction accuracy on certain individuals of interest. In this work we propose a novel version of the Lasso in which quadratic performance constraints are added

    Updated: 2020-04-20
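A hedged sketch of the idea in the entry above: a Lasso-type fit with an explicit quadratic performance constraint on a subgroup of individuals of interest. This is a generic cvxpy formulation written for illustration (the penalty weight 0.1, the subgroup of the first 40 individuals, and the bound tau are arbitrary), not the authors' estimator or algorithm.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(5)
n, p = 200, 20
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.normal(scale=0.5, size=n)
group_size = 40                     # hypothetical subgroup: the first 40 individuals

beta = cp.Variable(p)
resid = X @ beta - y
tau = 0.5                           # illustrative cap on the subgroup's mean squared error
objective = cp.Minimize(cp.sum_squares(resid) / n + 0.1 * cp.norm1(beta))
constraints = [cp.sum_squares(resid[:group_size]) / group_size <= tau]
cp.Problem(objective, constraints).solve()
print(np.round(beta.value, 2))
```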
  • Data projections by skewness maximization under scale mixtures of skew-normal vectors
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-03-10
    Jorge M. Arevalillo, Hilario Navarro

    Multivariate scale mixtures of skew-normal distributions are flexible models that account for the non-normality of data by means of a tail weight parameter and a shape vector representing the asymmetry of the model in a directional fashion. Their stochastic representation involves a skew-normal vector and a non-negative mixing scalar variable, independent of the skew-normal vector, that injects tail

    Updated: 2020-04-20
  • A novel semi-supervised support vector machine with asymmetric squared loss
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-03-10
    Huimin Pei, Qiang Lin, Liran Yang, Ping Zhong

    Laplacian support vector machine (LapSVM), which is based on the semi-supervised manifold regularization learning framework, performs better than the standard SVM, especially for the case where the supervised information is insufficient. However, the use of hinge loss leads to the sensitivity of LapSVM to noise around the decision boundary. To enhance the performance of LapSVM, we present a novel semi-supervised

    Updated: 2020-04-20
  • Count regression trees
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-05-10
    Nan-Ting Liu, Feng-Chang Lin, Yu-Shan Shih

    Count data frequently appear in many scientific studies. In this article, we propose a regression tree method called CORE for analyzing such data. At each node, besides a Poisson regression, a count regression such as a hurdle, negative binomial, or zero-inflated regression, which can accommodate over-dispersion and/or excess zeros, is fitted. A likelihood-based procedure is suggested to select split variables

    Updated: 2020-04-20
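CORE, described above, fits Poisson, hurdle, negative binomial or zero-inflated regressions inside tree nodes. As a much simpler stand-in, scikit-learn's regression tree can at least split on a Poisson criterion for count responses, illustrated below on synthetic counts.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(6)
X = rng.uniform(size=(500, 2))
lam = np.exp(1.0 + 1.5 * (X[:, 0] > 0.5))       # the Poisson rate jumps on half of the space
y = rng.poisson(lam)

tree = DecisionTreeRegressor(criterion="poisson", max_depth=2, random_state=0).fit(X, y)
print(tree.predict([[0.2, 0.5], [0.8, 0.5]]))   # predicted mean counts on both sides of the jump
```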
  • A fragmented-periodogram approach for clustering big data time series
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-06-14
    Jorge Caiado, Nuno Crato, Pilar Poncela

    We propose and study a new frequency-domain procedure for characterizing and comparing large sets of long time series. Instead of using all the information available from data, which would be computationally very expensive, we propose some regularization rules in order to select and summarize the most relevant information for clustering purposes. Essentially, we suggest using a fragmented periodogram

    Updated: 2020-04-20
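A loose sketch of the frequency-domain idea in the entry above: compute each series' periodogram, keep only a fragment of it as a cheap summary, and cluster those fragments. The low-frequency band chosen below is arbitrary and is not the regularization rule proposed by the authors.

```python
import numpy as np
from scipy.signal import periodogram
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
t = np.arange(400)
slow = [np.sin(2 * np.pi * 0.01 * t) + rng.normal(scale=0.5, size=400) for _ in range(10)]
fast = [np.sin(2 * np.pi * 0.10 * t) + rng.normal(scale=0.5, size=400) for _ in range(10)]
series = np.array(slow + fast)

freqs, pxx = periodogram(series, axis=1)            # one periodogram per series
fragment = pxx[:, (freqs > 0) & (freqs < 0.15)]     # keep only a low-frequency fragment
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(fragment)
print(labels)
```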
  • Data clustering based on principal curves
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-06-11
    Elson Claudio Correa Moraes, Danton Diego Ferreira, Giovani Bernardes Vitor, Bruno Henrique Groenner Barbosa

    In this contribution we present a new method for data clustering based on principal curves. Principal curves consist of a nonlinear generalization of principal component analysis and may also be regarded as continuous versions of 1D self-organizing maps. The proposed method implements the k-segment algorithm for principal curves extraction. Then, the method divides the principal curves into two or

    Updated: 2020-04-20
  • Classification using sequential order statistics
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-08-07
    Alexander Katzur, Udo Kamps

    Whereas discrimination methods and their error probabilities have been broadly investigated for common data distributions such as the multivariate normal or t-distributions, this paper considers the case when the recorded data are assumed to be observations from sequential order statistics. Random vectors of sequential order statistics describe, e.g., successive failures in a k-out-of-n system or in other

    Updated: 2020-04-20
  • Learning a metric when clustering data points in the presence of constraints
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-05-16
    Ahmad Ali Abin, Mohammad Ali Bashiri, Hamid Beigy

    Learning an appropriate distance measure under supervision of side information has become a topic of significant interest within the machine learning community. In this paper, we address the problem of metric learning for constrained clustering by considering three important issues: (1) considering an importance degree for constraints, (2) preserving the topological structure of data, and (3) preserving some

    Updated: 2020-04-20
  • How well do SEM algorithms imitate EM algorithms? A non-asymptotic analysis for mixture models
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-07-10
    Johannes Blömer, Sascha Brauer, Kathrin Bujna, Daniel Kuntze

    In this paper, we present a theoretical and an experimental comparison of EM and SEM algorithms for different mixture models. The SEM algorithm is a stochastic variant of the EM algorithm. The qualitative intuition behind the SEM algorithm is simple: If the number of observations is large enough, then we expect that an update step of the stochastic SEM algorithm is similar to the corresponding update

    Updated: 2020-04-20
  • Ensemble of optimal trees, random forest and random projection ensemble classification
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-06-12
    Zardad Khan, Asma Gul, Aris Perperoglou, Miftahuddin Miftahuddin, Osama Mahmoud, Werner Adler, Berthold Lausen

    The predictive performance of a random forest ensemble is highly associated with the strength of individual trees and their diversity. An ensemble of a small number of accurate and diverse trees, if prediction accuracy is not compromised, will also reduce the computational burden. We investigate the idea of integrating trees that are accurate and diverse. For this purpose, we utilize out-of-bag observations

    Updated: 2020-04-20
  • Clustering genomic words in human DNA using peaks and trends of distributions
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-05-31
    Ana Helena Tavares, Jakob Raymaekers, Peter J. Rousseeuw, Paula Brito, Vera Afreixo

    In this work we seek clusters of genomic words in human DNA by studying their inter-word lag distributions. Due to the particularly spiked nature of these histograms, a clustering procedure is proposed that first decomposes each distribution into a baseline and a peak distribution. An outlier-robust fitting method is used to estimate the baseline distribution (the ‘trend’), and a sparse vector of detrended

    Updated: 2020-04-20
  • Optimal arrangements of hyperplanes for SVM-based multiclass classification
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-07-26
    Víctor Blanco, Alberto Japón, Justo Puerto

    In this paper, we present a novel SVM-based approach to construct multiclass classifiers by means of arrangements of hyperplanes. We propose different mixed-integer (linear and non-linear) programming formulations for the problem using extensions of widely used measures for misclassifying observations, where the kernel trick can be adapted to be applicable. Some dimensionality reductions and variable

    Updated: 2020-04-20
  • Efficient regularized spectral data embedding
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-02-24
    Lazhar Labiod, Mohamed Nadif

    Data embedding (DE) or dimensionality reduction techniques are particularly well suited to embedding high-dimensional data into a space that in most cases will have just two dimensions. Low-dimensional space, in which data samples (data points) can more easily be visualized, is also often used for learning methods such as clustering. Sometimes, however, DE will identify dimensions that contribute little

    Updated: 2020-04-20
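A minimal example related to the entry above: spectral embedding maps data into a low-dimensional space via the graph Laplacian, and the embedded points are then clustered. This uses scikit-learn's unregularized baseline, not the regularized embedding proposed by the authors.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.manifold import SpectralEmbedding
from sklearn.cluster import KMeans

X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
Z = SpectralEmbedding(n_components=2, n_neighbors=10, random_state=0).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z)   # cluster in the embedding
print("cluster sizes:", np.bincount(labels))
```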
  • A combination of k-means and DBSCAN algorithm for solving the multiple generalized circle detection problem
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-02-12
    Rudolf Scitovski, Kristian Sabo

    Motivated by the problem of identifying rod-shaped particles (e.g. bacilliform bacterium), in this paper we consider the multiple generalized circle detection problem. We propose a method for solving this problem that is based on center-based clustering, where cluster-centers are generalized circles. An efficient algorithm is proposed which is based on a modification of the well-known k-means algorithm

    Updated: 2020-04-20
  • A robust spatial autoregressive scalar-on-function regression with t-distribution
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-01-29
    Tingting Huang, Gilbert Saporta, Huiwen Wang, Shanshan Wang

    Modelling functional data in the presence of spatial dependence is of great practical importance, as exemplified by applications in the fields of demography, economy and geography, and has received much attention recently. However, for the classical scalar-on-function regression (SoFR) with functional covariates and scalar responses, only relatively little literature is dedicated to this relevant area

    Updated: 2020-04-20
  • From-below Boolean matrix factorization algorithm based on MDL
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2020-01-08
    Tatiana Makhalova, Martin Trnecka

    During the past few years Boolean matrix factorization (BMF) has become an important direction in data analysis. The minimum description length principle (MDL) was successfully adapted in BMF for model order selection. Nevertheless, a BMF algorithm that performs well w.r.t. standard measures in BMF is still missing. In this paper, we propose a novel from-below Boolean matrix factorization algorithm

    Updated: 2020-04-20
  • Interval forecasts based on regression trees for streaming data
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-12-18
    Xin Zhao, Stuart Barber, Charles C. Taylor, Zoka Milan

    In forecasting, we often require interval forecasts instead of just a specific point forecast. To track streaming data effectively, this interval forecast should reliably cover the observed data and yet be as narrow as possible. To achieve this, we propose two methods based on regression trees: one ensemble method and one method based on a single tree. For the ensemble method, we use weighted results

    Updated: 2020-04-20
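The entry above builds interval forecasts from regression trees for streaming data. As a simple, non-streaming stand-in, two quantile gradient-boosted tree models below give the lower and upper limits of a 90% prediction interval.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(8)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=500)

# One model per interval limit, each minimizing the corresponding pinball (quantile) loss.
lower = GradientBoostingRegressor(loss="quantile", alpha=0.05, random_state=0).fit(X, y)
upper = GradientBoostingRegressor(loss="quantile", alpha=0.95, random_state=0).fit(X, y)
x_new = [[2.0], [5.0]]
print(np.column_stack([lower.predict(x_new), upper.predict(x_new)]))
```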
  • Modelling heterogeneity: on the problem of group comparisons with logistic regression and the potential of the heterogeneous choice model
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-12-13
    Gerhard Tutz

    The comparison of coefficients of logit models obtained for different groups is widely considered as problematic because of possible heterogeneity of residual variances in latent variables. It is shown that the heterogeneous logit model can be used to account for this type of heterogeneity by considering reduced models that are identified. A model selection strategy is proposed that can distinguish

    Updated: 2020-04-20
  • Mixtures of skewed matrix variate bilinear factor analyzers
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-11-21
    Michael P. B. Gallaugher, Paul D. McNicholas

    In recent years, data have become increasingly higher dimensional and, therefore, an increased need has arisen for dimension reduction techniques for clustering. Although such techniques are firmly established in the literature for multivariate data, there is a relative paucity in the area of matrix variate, or three-way, data. Furthermore, the few methods that are available all assume matrix variate

    Updated: 2019-11-21
  • From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering.
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-04-23
    Sylvia Frühwirth-Schnatter, Gertraud Malsiner-Walli

    In model-based clustering, mixture models are used to group data points into clusters. A useful concept introduced for Gaussian mixtures by Malsiner Walli et al. (Stat Comput 26:303-324, 2016) is sparse finite mixtures, where the prior distribution on the weight distribution of a mixture with K components is chosen in such a way that a priori the number of clusters in the data is random and is allowed

    Updated: 2019-11-01
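Loosely related to the entry above: scikit-learn's variational BayesianGaussianMixture can place either a finite Dirichlet prior or a Dirichlet-process (stick-breaking) prior on the mixture weights, so that an overfitted mixture empties superfluous components. This is a variational analogue for intuition, not the MCMC treatment compared in the paper.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(9)
X = np.vstack([rng.normal(-3, 1, (150, 2)), rng.normal(3, 1, (150, 2))])   # two true clusters

for prior in ["dirichlet_distribution", "dirichlet_process"]:
    bgm = BayesianGaussianMixture(n_components=10, weight_concentration_prior_type=prior,
                                  weight_concentration_prior=0.01, random_state=0).fit(X)
    print(prior, "-> active components:", int((bgm.weights_ > 0.01).sum()))
```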
  • Ensemble of a subset of kNN classifiers.
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2018-01-01
    Asma Gul, Aris Perperoglou, Zardad Khan, Osama Mahmoud, Miftahuddin Miftahuddin, Werner Adler, Berthold Lausen

    Combining multiple classifiers, known as ensemble methods, can give substantial improvement in the prediction performance of learning algorithms, especially in the presence of non-informative features in the data sets. We propose an ensemble of a subset of kNN classifiers, ESkNN, for the classification task in two steps. Firstly, we choose classifiers based upon their individual performance using the out-of-sample

    Updated: 2019-11-01
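A rough sketch in the spirit of the ESkNN entry above: train several kNN classifiers on bootstrap samples with random feature subsets, keep only the individually best members, and combine them by majority vote. The selection below uses a single validation split, a simplification of the paper's out-of-sample two-step procedure.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(10)
X = rng.normal(size=(400, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)          # only features 0 and 1 are informative
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

members = []
for _ in range(30):
    rows = rng.integers(0, len(X_tr), len(X_tr))             # bootstrap sample
    cols = rng.choice(X.shape[1], size=8, replace=False)     # random feature subset
    clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr[rows][:, cols], y_tr[rows])
    members.append((clf.score(X_val[:, cols], y_val), clf, cols))

members.sort(key=lambda m: m[0], reverse=True)
top = members[:10]                                           # keep the best-performing members
votes = np.array([clf.predict(X_val[:, cols]) for _, clf, cols in top])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)      # majority vote
print("ensemble accuracy:", (ensemble_pred == y_val).mean())
```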
  • Improved initialisation of model-based clustering using Gaussian hierarchical partitions.
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2016-03-08
    Luca Scrucca, Adrian E. Raftery

    Initialisation of the EM algorithm in model-based clustering is often crucial. Various starting points in the parameter space often lead to different local maxima of the likelihood function and, thus, to different clustering partitions. Among the several approaches available in the literature, model-based agglomerative hierarchical clustering is used to provide initial partitions in the popular mclust

    Updated: 2019-11-01
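A hedged sketch of the initialisation idea in the entry above: seed the EM algorithm of a Gaussian mixture with a partition from agglomerative hierarchical clustering instead of a random start. Plain Ward linkage is used here, which is cruder than mclust's model-based agglomerative step.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(11)
X = np.vstack([rng.normal(m, 0.8, (100, 2)) for m in (-4, 0, 4)])   # three well-separated groups

k = 3
hc_labels = AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(X)
means_init = np.vstack([X[hc_labels == g].mean(axis=0) for g in range(k)])

# EM starts from the hierarchical partition's cluster means instead of a random start.
gmm = GaussianMixture(n_components=k, means_init=means_init, random_state=0).fit(X)
print("log-likelihood per sample:", round(gmm.score(X), 3))
```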
  • Assessing and accounting for time heterogeneity in stochastic actor oriented models.
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2011-10-18
    Joshua A. Lospinoso, Michael Schweinberger, Tom A. B. Snijders, Ruth M. Ripley

    This paper explores time heterogeneity in stochastic actor oriented models (SAOM) proposed by Snijders (Sociological Methodology. Blackwell, Boston, pp 361-395, 2001) which are meant to study the evolution of networks. SAOMs model social networks as directed graphs with nodes representing people, organizations, etc., and dichotomous relations representing underlying relationships of friendship, advice

    Updated: 2019-11-01
  • Gaussian parsimonious clustering models with covariates and a noise component
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-09-20
    Keefe Murphy, Thomas Brendan Murphy

    We consider model-based clustering methods for continuous, correlated data that account for external information available in the presence of mixed-type fixed covariates by proposing the MoEClust suite of models. These models allow different subsets of covariates to influence the component weights and/or component densities by modelling the parameters of the mixture as functions of the covariates.

    Updated: 2019-09-20
  • A robust approach to model-based classification based on trimming and constraints
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-08-14
    Andrea Cappozzo, Francesca Greselin, Thomas Brendan Murphy

    In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations, namely outliers and data with incorrect labels, can strongly undermine the classifier performance, especially if the training size is small. The present work introduces

    Updated: 2019-08-14
  • Seemingly unrelated clusterwise linear regression
    Adv. Data Anal. Classif. (IF 1.603) Pub Date : 2019-08-12
    Giuliano Galimberti, Gabriele Soffritti

    Linear regression models based on finite Gaussian mixtures represent a flexible tool for the analysis of linear dependencies in multivariate data. They are suitable for dealing with correlated response variables when data come from a heterogeneous population composed of two or more sub-populations, each of which is characterised by a different linear regression model. Several types of finite mixtures

    Updated: 2019-08-12
Contents have been reproduced by permission of the publishers.