The three-way-in and three-way-out framework to treat and exploit ambiguity in data
Introduction and related works
In recent years, Machine Learning (ML) has attracted continuous interest from the research community, both from a mathematical-theoretical point of view and, more predominantly, from an applicative one. This interest has been stimulated by the fact that different research communities (e.g. health-care and medicine, finance and economics, …) have acknowledged the ubiquity of uncertainty, in its different forms (e.g. vagueness, randomness, ambiguity), as an intrinsic part of their practice [14],
Basic notions
In this section, we give the mathematical background on decision tables, orthopairs and orthopartitions that will be used in the following. Definition 2.1 A multi–observer decision table is a tuple (U, A, d) where U is a universe of objects of interest; A is a set of attributes (or features) that we use to represent objects in U. In particular, we define each attribute as a function a: U → Dom(a), where Dom(a) is the domain of values that the attribute a can assume; d is a distinguished decision attribute, that we assume
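The definition above can be illustrated with a minimal sketch of a decision table as a data structure. The class and attribute names below (`DecisionTable`, `row`, the toy `parity`/`sign` attributes) are hypothetical illustrations, not part of the paper's formalism; attributes are modeled as functions from the universe to their domains, as in Definition 2.1.

```python
from dataclasses import dataclass
from typing import Callable, Any

@dataclass
class DecisionTable:
    """A decision table: objects in U, described by attribute functions in A,
    plus a distinguished decision attribute d (here possibly set-valued)."""
    universe: list                                  # U: objects of interest
    attributes: dict[str, Callable[[Any], Any]]     # A: name -> function a: U -> Dom(a)
    decision: Callable[[Any], set]                  # d: U -> set of labels

    def row(self, obj):
        """Represent one object by its attribute values and its decision."""
        values = {name: a(obj) for name, a in self.attributes.items()}
        return values, self.decision(obj)

# Toy usage: two attributes and a decision that is set-valued for non-positive objects.
table = DecisionTable(
    universe=[1, 2, 3],
    attributes={"parity": lambda x: x % 2, "sign": lambda x: x > 0},
    decision=lambda x: {"pos"} if x > 0 else {"neg", "zero"},
)
print(table.row(2))  # ({'parity': 0, 'sign': True}, {'pos'})
```

Allowing `decision` to return a set rather than a single label is what later accommodates both set-valued outputs (Three-way Out) and set-valued training labels (Three-way In).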
Three–way output
In this section, we describe in greater detail the Three-way Out (TWO) learning setting. In this context, the chosen data representation, i.e., the selected features and/or their level of granularity, is such that a form of c-ambiguity arises. Thus, the chosen data representation does not allow us to distinguish different objects that are either identical or “too near” in the sample space, similarly to the concept of indiscernibility in Pawlak's standard rough sets [36] or generalized
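The indiscernibility idea behind this setting can be sketched in a few lines: objects with the same feature representation fall into the same class, and when such a class contains differently labeled objects, the natural output is the set of all labels seen there. This is a simplified illustration of indiscernibility and set-valued output, not the paper's actual TWO algorithm; the function names are hypothetical.

```python
from collections import defaultdict

def indiscernibility_classes(objects, features):
    """Group objects sharing the same feature representation:
    the chosen attributes cannot tell such objects apart."""
    classes = defaultdict(list)
    for obj in objects:
        key = tuple(f(obj) for f in features)
        classes[key].append(obj)
    return dict(classes)

def set_valued_prediction(class_objects, label):
    """Collect the labels occurring inside one indiscernibility class:
    a singleton when unambiguous, a larger set otherwise."""
    return {label(o) for o in class_objects}

# Toy data: the feature granularity (parity only) is too coarse
# to separate objects 1 and 3, which carry different labels.
objects = [1, 2, 3]
features = [lambda x: x % 2]
label = lambda x: "a" if x < 3 else "b"

for key, objs in indiscernibility_classes(objects, features).items():
    print(key, set_valued_prediction(objs, label))
# class (1,) contains objects 1 and 3, so its prediction is the set {'a', 'b'};
# class (0,) contains only object 2, so its prediction is the singleton {'a'}
```

A set-valued answer for the ambiguous class is precisely the partial abstention that makes the classifier's output honest about c-ambiguity rather than forcing an arbitrary single label.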
Three–way input
While Three–Way Output denotes a single phenomenon, i.e., a classifier emitting a set–valued classification, with Three–Way Input (TWI) we denote two different phenomena leading to a form of r-ambiguity, i.e., the presence of set–valued values in the training set (which is the input to the learning process):
1. The target attribute d is set–valued; this setting is also called learning from imprecise/partial labels [9], [25]: a set-valued classification can be seen as a partial abstention of the
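The set-valued-label case can be sketched with a toy disambiguation step: each training example carries a set of candidate labels, and a scoring function picks the most plausible one within each set. This is a hedged illustration of the general idea of learning from partial labels, not the paper's TWI method; `disambiguate` and the affinity score are hypothetical names.

```python
def disambiguate(instances, candidate_labels, score):
    """For each instance, keep the candidate label with the highest score.
    Singleton sets (precise labels) are returned unchanged."""
    resolved = []
    for x, labels in zip(instances, candidate_labels):
        resolved.append(max(labels, key=lambda y: score(x, y)))
    return resolved

# Toy usage: scalar instances in [0, 1]; the third example has an
# ambiguous (set-valued) label that the score function must resolve.
instances = [0.1, 0.9, 0.6]
candidates = [{"neg"}, {"pos"}, {"neg", "pos"}]
score = lambda x, y: x if y == "pos" else 1 - x  # affinity of x to each label

print(disambiguate(instances, candidates, score))  # ['neg', 'pos', 'pos']
```

Disambiguation of this kind turns a partially labeled training set back into a precisely labeled one, after which any standard classifier can be trained; more refined schemes iterate between model fitting and label resolution.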
Experimental validation and discussion
In order to assess the validity and efficacy of the proposed algorithms, in both the Three-way Out and Three-way In learning settings, we performed two sets of experimental validations: in Section 5.1 we report the experimental setting and obtained results in the Three-way Out case, while in Section 5.2 we report the same information for the Three-way In case.
Conclusions
In this article, we studied the ambiguity occurring in Machine Learning from a twofold perspective: both as a problem affecting the input of the learning process, and as a potential resource to make the output of classifiers more suitable for sound human decision making.
In particular, we presented techniques to represent and manage this type of uncertainty in the training data that is fed into the learning algorithm (what we called Three-way In), and also techniques to represent ambiguity and
Declaration of Competing Interest
The authors confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.
References (50)

- et al., Orthopartitions and soft clustering: soft mutual information measures for clustering validation, Knowl.-Based Syst. (2019)
- et al., Weak supervision and other non-standard classification problems, Pattern Recognit. Lett. (2016)
- et al., Structured approximations as a basis for three-way decisions in rough set theory, Knowl.-Based Syst. (2019)
- Learning from imprecise and fuzzy observations: data disambiguation through generalized loss minimization, Int. J. Approx. Reason. (2014)
- et al., Cost-sensitive sequential three-way decision modeling using a deep neural network, Int. J. Approx. Reason. (2017)
- et al., Multi-objective attribute reduction in three-way decision-theoretic rough set model, Int. J. Approx. Reason. (2019)
- et al., Multi-attribute group decision-making method based on multi-granulation weights and three-way decisions, Int. J. Approx. Reason. (2020)
- et al., Rough sets: some extensions, Inf. Sci. (2007)
- et al., Generalized multi-granulation double-quantitative decision-theoretic rough set of multi-source information system, Int. J. Approx. Reason. (2019)
- et al., The transferable belief model, Artif. Intell. (1994)