Current journal: Knowledge-Based Systems
  • An attention-guided and prior-embedded approach with multi-task learning for shadow detection
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-20
    Shihui Zhang; He Li; Weihang Kong; Xiaowei Zhang; Weidong Ren

    Shadow detection is a fundamental and challenging task that requires an accurate understanding of the visual semantic context of shadow regions and backgrounds. In this paper, we propose an attention-guided and prior-embedded approach with multi-task learning for shadow detection. Unlike most existing works, we introduce multi-task learning into this detection task to add a high-level prior to the detection process, instead of using a pretrained weighting network as the front-end module and a complex recurrent network. We also employ a channel attention-guided module so that high-level and low-level features complement each other. Moreover, we design a weighted loss function for effective training of the proposed multi-task approach. Experimental results on two publicly available benchmarks demonstrate that our approach achieves competitive results compared with existing typical shadow detection approaches.

    Updated: 2020-01-21
  • A cooperative coevolution algorithm for multi-objective fuzzy distributed hybrid flow shop
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-20
    Jie Zheng; Ling Wang; Jing-jing Wang

    Considering the uncertainty in distributed manufacturing systems, this paper addresses a multi-objective fuzzy distributed hybrid flow shop scheduling problem with fuzzy processing times and fuzzy due dates. To optimize the fuzzy total tardiness and robustness simultaneously, a cooperative coevolution algorithm with problem-specific strategies is proposed by combining the estimation of distribution algorithm (EDA) and the iterated greedy (IG) search. In the EDA-mode search, a problem-specific probability model is established to reduce the solution space and a sampling mechanism is proposed to generate new individuals. To enhance exploitation, a specific local search is designed to improve the performance of non-dominated solutions. Moreover, destruction and reconstruction methods in the IG-mode search are employed to further exploit better solutions. To balance exploration and exploitation capabilities, a cooperation scheme for mode switching is designed based on the information entropy and the diversity of elite solutions. The effect of the key parameters on the performance of the proposed algorithm is investigated using the Taguchi design-of-experiments method. Comparative results and statistical analysis demonstrate the effectiveness of the proposed algorithm in solving the problem.

    Updated: 2020-01-21
  • GEV-NN: A deep neural network architecture for class imbalance problem in binary classification
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-18
    Lkhagvadorj Munkhdalai; Tsendsuren Munkhdalai; Keun Ho Ryu

    Class imbalance is a common issue in many applications such as medical diagnosis, fraud detection, and web advertising. Although standard deep learning methods have achieved remarkably high performance on datasets with balanced classes, their ability to classify imbalanced datasets is still limited. This paper proposes a novel end-to-end deep neural network architecture that adopts the Gumbel distribution as an activation function for the class imbalance problem in binary classification. Our proposed architecture, named GEV-NN, consists of three components: the first component scores input variables to determine a set of suitable inputs, the second component is an auto-encoder that learns efficient explanatory features for the minority class, and the last component uses the combination of the scored inputs and extracted features to make the final prediction. We jointly optimize these components in end-to-end training. Extensive experiments using real-world imbalanced datasets showed that GEV-NN outperforms the state-of-the-art baselines by up to around 2%. In addition, GEV-NN makes it easier to interpret variable importance. Using the first component of GEV-NN, we find key risk factors for hypertension that are consistent with other scientific research.

    Updated: 2020-01-21
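
The Gumbel-based activation mentioned in the GEV-NN abstract can be illustrated with a minimal sketch. It assumes the activation is the standard Gumbel cumulative distribution function, exp(-exp(-x)), with location 0 and scale 1; the abstract does not state the exact parameterization, and the function names below are hypothetical. The sketch only contrasts the asymmetric Gumbel curve with the symmetric sigmoid.

```python
import math

def gumbel_activation(x):
    """Standard Gumbel CDF, exp(-exp(-x)); assumed form, not confirmed by the abstract."""
    return math.exp(-math.exp(-x))

def sigmoid(x):
    """Logistic sigmoid, shown for comparison only."""
    return 1.0 / (1.0 + math.exp(-x))

def symmetry_gap(f, x):
    """How far f(x) + f(-x) is from 1; zero for a curve that is symmetric about (0, 0.5)."""
    return abs(f(x) + f(-x) - 1.0)

# The sigmoid is symmetric about 0, while the Gumbel CDF is not:
# it approaches 0 much faster on the negative side than it approaches 1 on the positive side.
sigmoid_gap = symmetry_gap(sigmoid, 2.0)
gumbel_gap = symmetry_gap(gumbel_activation, 2.0)
```

Under this assumption, the asymmetry is the property that distinguishes the activation from the sigmoid; how the paper exploits it for the minority class is not shown here.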
  • Semantics of soft sets and three-way decision with soft sets
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-18
    Jilin Yang; Yiyu Yao

    The theory of three-way decision provides an effective tool for decision-making under uncertainty and incomplete information, when a two-way decision is difficult to make. Soft sets conceptualize and represent special types of uncertainty. In this paper, we suggest two plausible semantics of soft sets for modeling uncertainty, namely, a multi-context semantics and a possible-world semantics. Under the two semantics, we investigate three-way decision under uncertainty represented by soft sets. We introduce a qualitative model of three-way decision based on the core and support of a soft set. We formulate a quantitative model by a) transforming a soft set into a fuzzy set, b) transforming the resulting fuzzy set into a shadowed set (i.e., a three-way approximation of the fuzzy set), and c) making a three-way decision with the shadowed set. The results bring additional insights into soft sets, fuzzy sets, interval sets, and shadowed sets for decision making under uncertainty.

    Updated: 2020-01-21
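
The qualitative and quantitative models described in the soft-set abstract can be sketched roughly as follows. Everything specific here is an assumption made for illustration and is not taken from the paper: the core is taken as the intersection and the support as the union of the parameter-wise sets, fuzzy membership is the fraction of parameters whose set contains the object, and the shadowed set uses a single threshold `alpha` (assumed to lie between 0 and 0.5). All function names are hypothetical.

```python
def core_and_support(soft_set):
    """Core = objects in every parameter's set; support = objects in at least one (assumed definitions)."""
    sets = list(soft_set.values())
    return set.intersection(*sets), set.union(*sets)

def three_way_qualitative(universe, soft_set):
    """Accept the core, reject everything outside the support, defer the rest."""
    core, support = core_and_support(soft_set)
    return core, support - core, universe - support

def to_fuzzy(universe, soft_set):
    """Assumed transformation: membership = share of parameters that include the object."""
    n = len(soft_set)
    return {x: sum(x in s for s in soft_set.values()) / n for x in universe}

def three_way_shadowed(membership, alpha):
    """Shadowed-set style split: high membership -> accept, low -> reject, otherwise defer."""
    accept = {x for x, m in membership.items() if m >= 1 - alpha}
    reject = {x for x, m in membership.items() if m <= alpha}
    defer = set(membership) - accept - reject
    return accept, defer, reject

universe = {"a", "b", "c", "d"}
soft_set = {"e1": {"a", "b"}, "e2": {"a", "c"}, "e3": {"a", "b"}}
qualitative = three_way_qualitative(universe, soft_set)
quantitative = three_way_shadowed(to_fuzzy(universe, soft_set), alpha=0.4)
```

In this toy example the qualitative model defers on "b" and "c", while the quantitative model with `alpha=0.4` accepts "b" (membership 2/3) and rejects "c" (membership 1/3), which shows how the threshold shapes the deferment region.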
  • Obtaining accurate estimated action values in categorical distributional reinforcement learning
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-18
    Yingnan Zhao; Peng Liu; Chenjia Bai; Wei Zhao; Xianglong Tang

    Categorical Distributional Reinforcement Learning (CDRL) uses a categorical distribution with evenly spaced outcomes to model the entire distribution of returns and produces state-of-the-art empirical performance. However, using inappropriate bounds with CDRL may generate inaccurate estimated action values, which affect the policy update step and the final performance. In CDRL, the bounds of the distribution indicate the range of the action values that the agent can obtain in one task, without considering the policy’s performance and state–action pairs. The action values that the agent obtains are often far from the bounds, and this reduces the accuracy of the estimated action values. This paper describes a method of obtaining more accurate estimated action values for CDRL using adaptive bounds. This approach enables the bounds of the distribution to be adjusted automatically based on the policy and state–action pairs. To achieve this, we save the weights of the critic network over a fixed number of time steps, and then apply a bootstrapping method. In this way, we can obtain confidence intervals for the upper and lower bound, and then use the upper and lower bound of these intervals as the new bounds of the distribution. The new bounds are more appropriate for the agent and provide a more accurate estimated action value. To further correct the estimated action values, a distributional target policy is proposed as a smoothing method. Experiments show that our method outperforms many state-of-the-art methods on the OpenAI gym tasks.

    Updated: 2020-01-21
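
To make the role of the bounds in the CDRL abstract concrete, here is a minimal sketch of a categorical distribution with evenly spaced outcomes and the action value it implies. It does not implement the adaptive-bounds method of the paper; the function names, the bound values, and the "nearest atom" resolution measure are illustrative assumptions.

```python
def atoms(v_min, v_max, n_atoms):
    """Evenly spaced outcomes between the lower and upper bound (inclusive)."""
    step = (v_max - v_min) / (n_atoms - 1)
    return [v_min + i * step for i in range(n_atoms)]

def action_value(probs, support):
    """Estimated action value: the expectation of the categorical distribution."""
    return sum(p * z for p, z in zip(probs, support))

def nearest_atom_error(true_return, support):
    """Distance from a return to the closest outcome; a crude measure of resolution."""
    return min(abs(true_return - z) for z in support)

# With the same number of atoms, wide bounds spread the outcomes thinly,
# so returns that actually occur are represented less precisely.
wide = atoms(-100.0, 100.0, 5)   # outcomes spaced 50 apart
tight = atoms(0.0, 20.0, 5)      # outcomes spaced 5 apart
wide_error = nearest_atom_error(12.0, wide)
tight_error = nearest_atom_error(12.0, tight)
```

Here a return of 12 lies 12 units from the nearest outcome under the wide bounds but only 2 units away under the tight bounds, which is one way to read the abstract's remark that bounds far from the attainable action values reduce accuracy.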
  • Modeling and multi-neighborhood iterated greedy algorithm for distributed hybrid flow shop scheduling problem
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-18
    Weishi Shao; Zhongshi Shao; Dechang Pi

    With economic globalization, large manufacturing enterprises build production centers in different places to maximize profit. Therefore, scheduling problems among multiple production centers should be considered. This paper studies a distributed hybrid flow shop scheduling problem (DHFSP) with the makespan criterion, which combines the characteristics of distributed flow shop scheduling and parallel machine scheduling. In the DHFSP, a set of jobs is assigned to a set of identical factories for processing. Each job must follow the same route through a set of stages; each stage has several machines in parallel, and at least one stage has more than one machine. To solve the DHFSP, this paper proposes two algorithms: DNEH with the smallest-medium rule and a multi-neighborhood iterated greedy algorithm. The DNEH constructive heuristic with the smallest-medium rule first generates a seed sequence by decomposition and the smallest-medium rule, and then uses a greedy iteration to assign jobs to factories. In the iterated greedy algorithm, a multi-search construction is proposed, which applies the greedy insertion to the factory again after inserting a new job. Then, a multi-neighborhood local search is used to enhance local search ability. The proposed algorithms are evaluated by a comprehensive comparison, and the experimental results demonstrate that they are very competitive for solving the DHFSP.

    Updated: 2020-01-21
  • A region division based decomposition approach for evolutionary many-objective optimization
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-17
    Ruochen Liu; Jin Liu; Runan Zhou; Cheng Lian; Renyu Bian

    A region division based decomposition approach for evolutionary many-objective optimization (denoted as RD-EMO) is proposed in this paper. In the proposed RD-EMO, a set of reference points is generated and the objective space is divided into a set of regions through angle bisectors between adjacent reference lines. Two attributes of regions are then defined: region degree and region sparse rate. A selection operator based on these region attributes is designed to choose solutions in sparse regions of the objective space as mating solutions, so that the new solutions they create can be located in sparser regions. In addition, the region sparse rate is also applied to the population update process so that solutions in sparse regions of the objective space are retained and those in dense regions are discarded. Hence, the two region attributes help to maintain population diversity. Moreover, within regions of the same intensity, solutions with better scalar function values are retained, so that population convergence is also promoted. RD-EMO is compared with several state-of-the-art algorithms, including NSGA-III, MOEA/D-PBI, MOEA/DD, RVEA and MOEA/D-M2M, on a set of well-known multi-objective optimization problems (MOPs) with 3 to 15 objectives. The comparison shows that the proposed RD-EMO is superior in converging to the approximate Pareto Front (PF) with a well-spread distribution. We also apply it to nine many-objective 0/1 knapsack problems (MKPs) and obtain good performance.

    Updated: 2020-01-17
  • Batch allocation for decomposition-based complex task crowdsourcing e-markets in social networks
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-17
    Jiuchuan Jiang; Yifeng Zhou; Yichuan Jiang; Zhan Bu; Jie Cao

    In existing studies on decomposition-based complex task crowdsourcing e-markets, a complex task is first decomposed into a flow of simple subtasks and then the decomposed subtasks are allocated independently to different individual workers. However, such retail-style independent allocation of decomposed subtasks is time-consuming, and the intermediate results of subtasks cannot be shared among them; moreover, the independent allocation does not consider the cooperation among assigned workers or the time-dependency relations among subtasks. To solve this problem, this paper presents a novel batch allocation approach for decomposition-based complex task crowdsourcing in social networks, in which similar subtasks of complex tasks are integrated into a batch that is allocated to the same workers. In the presented approach, a batch of subtasks is preferably allocated to workers within the same group or to workers with closer relations in a social network; moreover, the allocation considers the time constraints of subtasks so that the deadlines of the whole complex tasks can be satisfied. This batch allocation optimization problem is proved to be NP-hard. Then, two types of heuristic approaches are designed: the lateral approach, which does not consider the subordination relationship between subtasks and complex tasks, and the longitudinal approach, which considers such relationships. Experiments on real-world crowdsourcing datasets show that the two presented heuristic approaches outperform the traditional retail-style allocation approach in terms of total payment by requesters, average income of assigned workers, cooperation efficiency of assigned workers, and task allocation time.

    Updated: 2020-01-17
  • Constrained bilinear factorization multi-view subspace clustering
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-16
    Qinghai Zheng; Jihua Zhu; Zhiqiang Tian; Zhongyu Li; Shanmin Pang; Xiuyi Jia

    Multi-view clustering is an important and fundamental problem. Many multi-view subspace clustering methods have been proposed, and most of them assume that all views share the same coefficient matrix. However, the underlying information of multi-view data is not fully exploited under this assumption, since the coefficient matrices of different views should have the same clustering properties rather than be identical across views. To this end, this paper proposes a novel Constrained Bilinear Factorization Multi-view Subspace Clustering (CBF-MSC) method. Specifically, a bilinear factorization with an orthonormality constraint and a low-rank constraint is imposed on all coefficient matrices so that they have the same trace norm instead of being equivalent, in order to explore the consensus information of multi-view data more fully. Finally, an Augmented Lagrangian Multiplier (ALM) based algorithm is designed to optimize the objective function. Comprehensive experiments on nine benchmark datasets validate the effectiveness and competitiveness of the proposed approach compared with several state-of-the-art methods.

    Updated: 2020-01-17
  • Detection of SQL injection based on artificial neural network
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-16
    Peng Tang; Weidong Qiu; Zheng Huang; Huijuan Lian; Guozhen Liu

    SQL injection, a common web attack, is a challenging network security issue that causes millions of dollars of financial loss worldwide every year, as well as large-scale leakage of users’ private data. This work presents a high-accuracy SQL injection detection method based on a neural network. We first acquire authentic user URL access log data from an Internet Service Provider (ISP), ensuring that our approach is realistic, effective and practical. We then conduct statistical research on normal data and SQL injection data. Based on the statistical results, we design eight types of features and train an MLP model. The accuracy of the model remains above 99%. We also compare and evaluate the training results of other machine learning algorithms (LSTM, for example); the results reveal that the accuracy of our method is superior to that of these algorithms.

    Updated: 2020-01-17
  • Aedes mosquito detection in its larval stage using deep neural networks
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-07-19
    Antonio Arista-Jalife; Mariko Nakano; Zaira Garcia-Nonoal; Daniel Robles-Camarillo; Hector Perez-Meana; Heriberto Antonio Arista-Viveros

    Dengue, Chikungunya and Zika viruses cause dangerous infections in tropical and subtropical regions throughout the world. The World Health Organization estimates that one out of every three persons in the entire human population is in danger of contracting one of these diseases from a single mosquito bite. Currently, these viral infections are not preventable by vaccines and there is not a direct treatment that can effectively diminish the viral infection, which causes a wide range of pathologies, including severe joint pain, internal blood loss, permanent neurological damage in unborn children and even death. Due to this grim scenario, the best and maybe the only line of defense against these diseases is the effective surveillance, control and suppression of the mosquitoes that transmit these viruses: Aedes aegypti and Aedes albopictus. In this paper, we present a complete solution that is capable of identifying the Aedes aegypti and Aedes albopictus mosquito in the larval stage, which is easily disposable, restricted to water bodies, and incapable of transmitting diseases according to the Centers for Disease Control and Prevention (CDC). Our proposal is based on deep neural networks (DNN) that effectively recognize larval samples with an accuracy of 94.19%, which is better than other state-of-the-art automatic methods. Additionally, the capabilities of our proposed DNN allow us to automatically crop the region of interest (ROI) with an accuracy of 92.85% and then automatically classify the region as Aedes positive or Aedes negative, without further human intervention and in less than a second, accelerating the response time for biological control from days to seconds. Our proposal includes hardware designs that allow inexpensive implementation, making it suitable for isolated communities, underdeveloped countries, and rural or urban areas.

    Updated: 2020-01-16
  • Mining distinct and contiguous sequential patterns from large vehicle trajectories
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-09-28
    Luke Bermingham; Ickjai Lee

    We focus on the problem of using contiguous SPM to extract succinct, redundancy controlled patterns from large vehicle trajectories. Although there exist several techniques to reduce the contiguous sequential pattern output such as closed and max SPM, they still produce massive redundant pattern outputs when the input sequence database is sufficiently large and homogeneous — as is often the case for vehicle trajectories. Therefore, in this work we propose DC-SPAN: a distinct contiguous SPM algorithm. DC-SPAN mines a set of sequential patterns where the maximum redundancy of the pattern output is controlled by a user-specified parameter. Through various experiments using real world trajectory datasets we show DC-SPAN effectively controls the redundancy of the pattern output with trade-offs in pattern distinctness. Additionally, our experiments also indicate that DC-SPAN efficiently computes these patterns, incurring only a marginal running time cost over existing state-of-the-art contiguous SPM approaches. Lastly, due to the less redundant and more succinct pattern output we also briefly explore visualisation as a useful technique to interpret the discovered vehicle routes.

    Updated: 2020-01-16
  • Consistency and consensus-driven models to personalize individual semantics of linguistic terms for supporting group decision making with distribution linguistic preference relations
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-04
    Xiaoan Tang; Zhanglin Peng; Qiang Zhang; Witold Pedrycz; Shanlin Yang

    Distribution linguistic preference relations (DLPRs) that model linguistic expressions with the aid of probabilistic distributions of multiple linguistic terms provide an effective tool to accurately elicit the preferences of decision makers (DMs) in linguistic decisions. Meanwhile, numerical scale models have been suitable choices for DMs to handle computing with words when solving linguistic decision problems. This study focuses on improving the group decision making (GDM) with DLPRs via the help of numerical scale models by filling the following gap. It is obvious that words might exhibit different meanings for different people. DMs may have a varying understanding of a given linguistic term in real-world fuzzy linguistic GDM. Setting personalized semantics of the linguistic terms for each DM becomes a critical task in GDM with DLPRs. To do this, we first define an improved numerical scale model to facilitate the linkages between DLPRs and numerical fuzzy preference relations. Then an additive consistency and a multiplicative consistency of DLPRs are analyzed, and the corresponding consistency indices are provided to measure the consistency levels of DLPRs. Based on them, we develop two consistency-driven optimization models to personalize numerical scales for linguistic terms with individual DLPRs. Next, we develop an approach for addressing GDM with DLPRs. In the proposed approach, a dissimilarity-based consensus measure is designed. To determine a group numerical scale for the linguistic terms with the corresponding group DLPR, two consistency and consensus-driven optimization models are constructed. Finally, illustrative examples are analyzed using the proposed approach to demonstrate its applicability and validity.

    Updated: 2020-01-16
  • DISL: Deep Isomorphic Substructure Learning for network representations
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-09
    Shicheng Cui; Tao Li; Shu-Ching Chen; Mei-Ling Shyu; Qianmu Li; Hong Zhang

    The analysis of complex networks based on deep learning has drawn much attention recently. Generally, due to the scale and complexity of modern networks, traditional methods are gradually losing the analytic efficiency and effectiveness. Therefore, it is imperative to design a network analysis model which caters to the massive amount of data and learns more comprehensive information from networks. In this paper, we propose a novel model, namely Deep Isomorphic Substructure Learning (DISL) model, which aims to learn network representations from patterns with isomorphic substructures. Specifically, in DISL, deep learning techniques are used to learn a better network representation for each vertex (node). We provide the method that makes the isomorphic units self-embed into vertex-based subgraphs whose explicit topologies are extracted from raw graph-structured data, and design a Probability-guided Random Walk (PRW) procedure to explore the set of substructures. Sequential samples yielded by PRW provide the information of relational similarity, which integrates the information of correlation and co-occurrence of vertices and the information of substructural isomorphism of subgraphs. We maximize the likelihood of the preserved relationships for learning the implicit similarity knowledge. The architecture of the Convolutional Neural Networks (CNNs) is redesigned for simultaneously processing the explicit and implicit features to learn a more comprehensive representation for networks. The DISL model is applied to several vertex classification tasks for social networks. Our results show that DISL outperforms the challenging state-of-the-art Network Representation Learning (NRL) baselines by a significant margin on accuracy and weighted-F1 scores over the experimental datasets.

    Updated: 2020-01-16
  • Interactive double states emotion cell model for textual dialogue emotion prediction
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-09
    Dayu Li; Yang Li; Suge Wang

    Daily dialogues are full of emotions that control the trends of dialogues and influence the attitudes of interlocutors toward each other, and understanding the human emotions in dialogues is of great significance in emotional comfort, human–computer interaction and intelligent question-answering. This paper defines a new task called emotion prediction in textual dialogue. Different from the text emotion recognition task, which derives the current emotional state of interlocutor from the utterance, emotion prediction aims at predicting the future emotional state of interlocutor before the interlocutor utters something. Moreover, this paper summarizes and explains three notable characteristics of emotional propagation in text dialogue: context dependence, persistence and contagiousness. By considering these characteristics, a fully data-driven interactive double states emotion cell model (IDS-ECM) is proposed. The model has two layers. The first layer automatically extracts the emotional information of historical dialogue and is used to describe the contextual dependence of the textual dialogue emotion. The second layer models the change process of interlocutors’ emotional states during the dialogue and depicts the persistence and contagiousness of emotions. Experimental results on two manually annotated datasets show that the proposed model is superior to the baseline in the macro-averaged F1 evaluation metric and that the proposed model can simulate the emotional changes in the process of dialogue so as to predict the emotions with high accuracy. The experimental results also reveal the communication differences between different emotional categories in dialogue, which is of guiding significance for future research.

    Updated: 2020-01-16
  • Managing minority opinions in micro-grid planning by a social network analysis-based large scale group decision making method with hesitant fuzzy linguistic information
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-09-24
    Ruxue Ren; Ming Tang; Huchang Liao

    The growth of global electricity demand has placed higher requirements on power distribution networks. The high cost of large-scale power systems and the call for renewable energy have driven the emergence of the micro-grid, which plays a complementary role in the power generation of large-scale power systems. Micro-grid planning is complex, and many stakeholders’ opinions should be considered for a comprehensive evaluation. Furthermore, with the development of social big data techniques, such as e-marketplace and e-democracy, experts are connected through social relationships. This study aims to develop a consensus model to manage minority opinions for large-scale group decision making with social network analysis for micro-grid planning. To deal with the vague and uncertain features of complex micro-grid planning problems, experts are supposed to use hesitant fuzzy linguistic term sets to express their opinions. A social network analysis-based clustering method is introduced to classify experts. In a large-scale group decision making problem, the opinions of experts should be fully considered, especially the minority opinions. This model considers the minority opinions in a micro-grid planning problem and provides an approach to manage them. Finally, we use an illustrative example concerning micro-grid planning decision making in the Ali district of Tibet to demonstrate the effectiveness and practicability of the proposed model.

    Updated: 2020-01-16
  • Parameter tuning for meta-heuristics
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-10
    Susheel Kumar Joshi; Jagdish Chand Bansal
    Updated: 2020-01-16
  • A new fast search algorithm for exact k-nearest neighbors based on optimal triangle-inequality-based check strategy
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-09
    Yiwei Pan; Zhibin Pan; Yikun Wang; Wei Wang

    The k-nearest neighbor (KNN) algorithm has been widely used in pattern recognition, regression, outlier detection and other data mining areas. However, it suffers from the large distance computation cost, especially when dealing with big data applications. In this paper, we propose a new fast search (FS) algorithm for exact k-nearest neighbors based on optimal triangle-inequality-based (OTI) check strategy. During the procedure of searching exact k-nearest neighbors for any query, the OTI check strategy can eliminate more redundant distance computations for the instances located in the marginal area of neighboring clusters compared with the original TI check strategy. Considering the large space complexity and extra time complexity of OTI, we also propose an efficient optimal triangle-inequality-based (EOTI) check strategy. The experimental results demonstrate that our proposed two algorithms (OTI and EOTI) achieve the best performance compared with other related KNN fast search algorithms, especially in the case of dealing with high-dimensional datasets.

    Updated: 2020-01-16
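
The general idea behind a triangle-inequality check in exact k-nearest-neighbor search can be sketched as follows. This is a baseline TI check with a single reference point, not the OTI or EOTI strategies of the paper, whose details the abstract does not give; the function name and the single-reference setup are assumptions. For a reference point r, the triangle inequality gives d(q, x) >= |d(q, r) - d(x, r)|, so a candidate whose lower bound already exceeds the current k-th best distance can be skipped without computing its full distance.

```python
import heapq
import math

def knn_with_ti(query, points, ref, k):
    """Exact k-NN; returns (neighbor indices nearest first, number of full distances computed)."""
    d_ref = [math.dist(p, ref) for p in points]   # can be precomputed once per dataset
    d_q_ref = math.dist(query, ref)
    heap = []        # max-heap of the best k so far, stored as (-distance, index)
    computed = 0
    for i, p in enumerate(points):
        lower = abs(d_q_ref - d_ref[i])           # triangle-inequality lower bound on d(query, p)
        if len(heap) == k and lower >= -heap[0][0]:
            continue                              # cannot beat the current k-th neighbor
        d = math.dist(query, p)
        computed += 1
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    neighbors = [i for _, i in sorted(heap, reverse=True)]
    return neighbors, computed

points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (20.0, 0.0)]
neighbors, computed = knn_with_ti((0.4, 0.0), points, ref=(0.0, 0.0), k=2)
```

In this toy run only two of the five full distances are computed, because the three distant points are ruled out by the lower bound alone. Practical methods typically use cluster centers as reference points, which is where the choice of check strategy starts to matter.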
  • Incremental updating approximations for double-quantitative decision-theoretic rough sets with the variation of objects
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-04
    Yanting Guo; Eric C.C. Tsang; Meng Hu; Xuxin Lin; Degang Chen; Weihua Xu; Binbin Sang

    Double-quantitative decision-theoretic rough sets (Dq-DTRS) provide more comprehensive description methods for rough approximations of concepts, which lay foundations for the development of attribute reduction and rule extraction in rough sets. Existing research on concept approximations of Dq-DTRS focuses on the equivalence class of each object in approximating a concept, and calculates concept approximations from the whole data set in a batch. This makes the calculation of approximations time-consuming in dynamic data sets. In this paper, we first analyze the variations of equivalence classes, decision classes, conditional probability, internal grade and external grade in dynamic data sets as objects vary sequentially or simultaneously over time. Then we propose updating mechanisms for the concept approximations of two types of Dq-DTRS models from an incremental perspective in dynamic decision information systems with sequential and batch variations of objects. We also design incremental sequential insertion, sequential deletion, batch insertion, and batch deletion algorithms for the two Dq-DTRS models. Finally, we present experimental comparisons showing the feasibility and efficiency of the proposed incremental approaches in calculating approximations, and the stability of the incremental updating algorithms in terms of runtime under different insertion and deletion ratios and parameter values.

    Updated: 2020-01-16
  • Cascaded dual-scale crossover network for hyperspectral image classification
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-14
    Feilong Cao; Wenhui Guo

    In recent years, deep neural networks have exhibited numerous advantages in hyperspectral image classification (HIC). However, owing to the limited number of training samples of hyperspectral images (HSIs), the network structure should not be too deep, in order to limit overfitting. This study proposes a cascaded dual-scale crossover network for HIC, which can extract rich features without making the network deeper. It continuously connects two different cascaded dual-scale crossover blocks and automatically extracts the spectral–spatial features of HSIs. Moreover, given the limited training samples, the proposed network can flexibly capture more discriminative contextual features by using convolution kernels of different spectral and spatial sizes. Furthermore, two different cross-merge methods are designed to improve the information flow and contrast of the images in order to obtain the parts of interest. Two skip structures are also used to alleviate overfitting and accelerate network training. Experimental results on several datasets, including Indian Pines, Kennedy Space Center, and University of Pavia, verify the feasibility of the proposed network: its classification accuracy is superior to that of other existing networks.

    Updated: 2020-01-16
  • An approach to generalizing the handling of preferences in argumentation-based decision-making systems
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-14
    Juan C.L. Teze; Sebastian Gottifredi; Alejandro J. García; Guillermo R. Simari

    As a practical mechanism for formalizing commonsense reasoning, argumentation has shown its potential for applications in diverse areas, many related to decision-making in knowledge-based systems. Following this line, and for helping users in making a better and informed decision, different recommender systems proposals have been developed in the argumentation literature. We will use recommender systems as a good example where to exercise our proposal. In particular, the role of preference criterion in argumentation-based recommender systems which is used to compare competing arguments is central to the user’s query answering process where if the criterion does not adjust to the represented domain, the system could fail by being undecided too often. Therefore, having tools that allow to select and change the argument comparison mechanism has to be used become a central issue. Argumentation-based recommender systems that offer these tools provide an interesting ability that can be used for improving the reasoning capabilities in this type of systems. This work introduces an approach to handle multiple argument preference criteria in argumentation-based recommender systems and general knowledge-based decision support systems. More precisely, the proposal allows changing the information that a criterion can use in the argument comparison process and specify how several criteria can be simultaneously used in such process as well; to achieve that goal, a set of operators to combine several criteria is presented. The knowledge representation and reasoning is performed in Defeasible Logic Programming, a defeasible argumentation formalism based on logic programming.

    Updated: 2020-01-16
  • F-Mapper: A Fuzzy Mapper clustering algorithm
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-11
    Quang-Thinh Bui; Bay Vo; Hoang-Anh Nguyen Do; Nguyen Quoc Viet Hung; Vaclav Snasel

    Using topology in data analysis, known as Topological Data Analysis (TDA), is now a promising new area of data mining research. One of the important and foundational tools of TDA is the Mapper algorithm. During the past two decades, this algorithm has proven useful and robust in extracting insights and meaningful information from high-dimensional datasets. Nevertheless, several alterations in the choices of parameters, such as lens, cover and clustering, can be used to develop this algorithm. In this paper, we propose the F-Mapper algorithm, built on the Mapper algorithm, to automate the division of cover intervals with an arbitrary percentage of overlap. To assess the efficiency of this enhanced algorithm, experiments were carried out on three datasets: the Unit Circle, Reaven and Miller Diabetes, and NKI Breast Cancer. The experimental results are analyzed and compared with those of the original Mapper algorithm through the output image and the silhouette coefficient score used to evaluate the clustering.

    Updated: 2020-01-16
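As a point of reference for the F-Mapper entry above, the sketch below shows the conventional way a Mapper cover is built: a fixed number of equal-length intervals over the lens range, with a fixed fraction of overlap between neighbours. This is the baseline that the abstract says F-Mapper improves on, not the F-Mapper procedure itself; the function name `uniform_cover` and its parameters are illustrative assumptions.

```python
def uniform_cover(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with n_intervals equal-length intervals.

    `overlap` is the fraction (0 <= overlap < 1) of an interval's length
    shared with its right-hand neighbour. Hypothetical helper illustrating
    the classical Mapper cover, not the F-Mapper algorithm.
    """
    if n_intervals < 1 or not 0 <= overlap < 1 or hi <= lo:
        raise ValueError("need n_intervals >= 1, 0 <= overlap < 1 and hi > lo")
    # n intervals of length L, each shifted by L * (1 - overlap), span
    # L * (n - (n - 1) * overlap) in total; solve that for L.
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]


if __name__ == "__main__":
    for left, right in uniform_cover(0.0, 10.0, 3, 0.5):
        print(f"[{left:.2f}, {right:.2f}]")
```

In this baseline the number of intervals and the overlap are both chosen by hand, which is the manual step the abstract describes automating.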
  • MapReduce based improved quick reduct algorithm with granular refinement using vertical partitioning scheme
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-14
    Pandu Sowkuntla; P.S.V.S. Sai Prasad

    In the last few decades, rough sets have evolved to become an essential technology for feature subset selection by way of reduct computation in categorical decision systems. In recent years, with the proliferation of MapReduce for distributed/parallel algorithms, several scalable reduct computation algorithms have been developed in this field for large-scale decision systems using MapReduce. The existing MapReduce-based reduct computation approaches use horizontal partitioning (division in object space) of the dataset across the nodes of the cluster, requiring a complicated shuffle and sort phase. In this work, we propose an algorithm, MR_IQRA_VP, which is designed using vertical partitioning (division in attribute space) of the dataset with a simplified shuffle and sort phase of the MapReduce framework. MR_IQRA_VP is a distributed/parallel implementation of the Improved Quick Reduct Algorithm (IQRA_IG) and is implemented using the iterative MapReduce framework of Apache Spark. We have carried out an extensive experimental comparison on benchmark decision systems against existing horizontal partitioning-based reduct computation algorithms. Through experimental analysis, along with theoretical validation, we have established that MR_IQRA_VP is suitable for and scales to datasets with a large attribute space and a moderate object space, which are prevalent in the areas of Bioinformatics and Web mining.

    Updated: 2020-01-16
  • Granular structure-based incremental updating for multi-label classification
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-16
    Yuanjian Zhang; Duoqian Miao; Witold Pedrycz; Tianna Zhao; Jianfeng Xu; Ying Yu

    Incremental learning is an efficient computational paradigm for acquiring approximate knowledge of data in dynamic environments. Most of the research focuses on knowledge updating for single-label classification, whereas incremental mechanisms for multi-label classification are still at a preliminary stage. This leads to considerable computational complexity to maintain the desired performance. To address this challenge, we formulate a granular structure system (GSS). The proposed granular structure system provides, in a bottom-up way, a systematic view of label-specific classification. We demonstrate that the three-way selective ensemble (TSEN) model, a state-of-the-art solution for multi-label classification, is compatible with GSS in granulation. An incremental mechanism of GSS is introduced for both label-specific feature generation and optimization, and an incremental three-way selective ensemble algorithm for multiple instances immigration (IMOTSEN) is presented. Experiments on six datasets show that the proposed algorithm can maintain considerable classification performance while significantly accelerating the knowledge (GSS) updating.

    Updated: 2020-01-16
  • A T1OWA fuzzy linguistic aggregation methodology for searching feature-based opinions
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-18
    Jesus Serrano-Guerrero; Francisco Chiclana; Jose A. Olivas; Francisco P. Romero; Elmina Homapour

    Online services such as Amazon, Tripadvisor, Ebay, etc., allow users to express sentiments about different products or services. In some cases it is also possible to express sentiments about the different features characterizing those products or services. Most users express sentiments about individual features by using numerical values, which sometimes do not allow them to reflect properly what they mean and can therefore be misleading. To overcome this key issue and make users’ opinions in online services more comprehensive, a new methodology for representing sentiments using linguistic term sets instead of numerical values is presented. In addition, this methodology allows importance degrees to be assigned to the different features characterizing users’ opinions. From both the sentiments and the importance of the features, the most important opinions for each user are derived via an aggregation step based on the Type-1 Ordered Weighted Averaging (T1OWA) operator, which is able to aggregate the corresponding fuzzy set representations of linguistic terms. Furthermore, the final output of the T1OWA-based search process can easily be interpreted by users because it is always of the same type (fuzzy) and defined in the same domain as the original fuzzy linguistic labels. A case study is presented where the T1OWA operator methodology is used to assess different opinions according to different user profiles.

    Updated: 2020-01-16
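For readers unfamiliar with ordered weighted averaging, the sketch below shows the ordinary OWA operator on crisp numbers. It is only the numeric special case: the T1OWA operator in the entry above aggregates fuzzy sets, which this example does not attempt. The function name `owa` and the sample ratings are illustrative assumptions.

```python
def owa(values, weights):
    """Ordinary (crisp) OWA: weights apply to ranks, not to sources.

    The values are sorted in descending order and the i-th weight is
    applied to the i-th largest value. Illustrative helper only; it is
    not the Type-1 OWA operator, which works on fuzzy sets.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


if __name__ == "__main__":
    ratings = [3.0, 1.0, 2.0]            # hypothetical feature ratings
    print(owa(ratings, [1.0, 0.0, 0.0]))  # all weight on the largest value
    print(owa(ratings, [0.0, 0.0, 1.0]))  # all weight on the smallest value
    print(owa(ratings, [1/3, 1/3, 1/3]))  # equal weights give the mean
```

Choosing the weight vector moves the operator between the minimum and the maximum, which is what makes OWA-style operators useful for modelling how optimistic or pessimistic an aggregation should be.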
  • On representation of fuzzy measures for learning Choquet and Sugeno integrals
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-18
    Gleb Beliakov; Dmitriy Divakov

    This paper examines the marginal contribution representation of fuzzy measures, used to construct fuzzy measures from empirical data through an optimization process. We show that the number of variables can be drastically reduced, and the constraints simplified, by using an alternative representation. This technique makes optimizing fitting criteria more efficient numerically, and allows one to tackle learning problems with a higher number of correlated decision criteria.

    Updated: 2020-01-16
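To make the object being learned in the entry above concrete, the following sketch evaluates a discrete Choquet integral for a fuzzy measure given in the standard set-function form (one value per subset of criteria). It does not use the marginal contribution representation or the alternative representation studied in the paper; the function name `choquet` and the dictionary layout are assumptions made for illustration.

```python
def choquet(x, mu):
    """Discrete Choquet integral of x with respect to the fuzzy measure mu.

    x  : list of criterion scores, x[i] for criterion i
    mu : dict mapping frozenset of criterion indices -> measure value,
         with mu[frozenset()] == 0 and mu[all criteria] == 1, monotone
         with respect to set inclusion (assumed, not checked here).
    """
    order = sorted(range(len(x)), key=lambda i: x[i])  # ascending scores
    remaining = set(range(len(x)))  # criteria whose score is >= current one
    total, previous = 0.0, 0.0
    for i in order:
        # Each increment in score is weighted by the measure of the
        # coalition of criteria that reach at least that score.
        total += (x[i] - previous) * mu[frozenset(remaining)]
        previous = x[i]
        remaining.remove(i)
    return total


if __name__ == "__main__":
    # Additive measure: the integral reduces to a weighted mean.
    additive = {
        frozenset(): 0.0,
        frozenset({0}): 0.3,
        frozenset({1}): 0.7,
        frozenset({0, 1}): 1.0,
    }
    print(choquet([0.2, 0.8], additive))
```

A measure on n criteria has 2^n subset values in this form, which illustrates why the abstract is concerned with reducing the number of variables in the fitting problem.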
  • Assignment of attribute weights with belief distributions for MADM under uncertainties
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-14
    Mi Zhou; Xin-Bao Liu; Yu-Wang Chen; Xiao-Fei Qian; Jian-Bo Yang; Jian Wu

    Multiple attribute decision making (MADM) problems often consist of various types of quantitative and qualitative attributes. Quantitative attributes can be assessed by accurate numerical values, interval values or fuzzy numbers, while qualitative attributes can be evaluated by belief distributions, linguistic variables or intuitionistic fuzzy sets. However, the determination of attribute weights remains an open issue in MADM problems. In the traditional objective weight assignment method, attributes are usually assessed by accurate values. In this paper, an entropy weight assignment method is proposed to deal with the situation where the assessment of attributes can contain uncertainties, e.g., interval values, or contain both uncertainties and incompleteness, e.g., belief distributions. The advantage of the proposed method is that the uncertainties and incompleteness contained in the interval numerical values or belief distributions can be preserved in the generated weights. Specifically, several pairs of programming models to generate the weights of attributes are constructed in three different circumstances: (1) quantitative attributes expressed by interval values; (2) incomplete belief distributions with accurate belief degrees; and (3) belief distributions constituted by interval belief degrees. The evidential reasoning approach is then utilized to aggregate the distributions of attributes based on the generated attribute weights. The normalized interval weight vector is defined, and the characteristics of the weight assignment method are discussed. The proposed method has been tested with real data to illustrate its advantages and its potential for supporting MADM with uncertain and incomplete information.

    Updated: 2020-01-16
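The "traditional objective weight assignment method" that the entry above starts from is commonly the classical entropy weight method on exact values. The sketch below implements that crisp baseline only, under the assumption of a non-negative decision matrix; it does not handle interval values or belief distributions, which are the paper's contribution. The function name `entropy_weights` and the sample matrix are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Classical entropy weights for a crisp decision matrix.

    matrix[i][j] is the non-negative score of alternative i on attribute j.
    Attributes whose scores vary more across alternatives (lower entropy)
    receive larger weights. Baseline sketch, not the paper's interval or
    belief-distribution models.
    """
    m, n = len(matrix), len(matrix[0])
    if m < 2:
        raise ValueError("need at least two alternatives")
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total <= 0:
            raise ValueError("each attribute needs a positive column sum")
        shares = [v / total for v in column]
        # Normalised Shannon entropy in [0, 1]; 0 * log(0) is taken as 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        divergences.append(1.0 - entropy)
    norm = sum(divergences)
    if norm <= 0:
        raise ValueError("all attributes are uniform; weights are undefined")
    return [d / norm for d in divergences]


if __name__ == "__main__":
    scores = [[1.0, 1.0],
              [1.0, 3.0]]  # attribute 0 is identical for both alternatives
    print(entropy_weights(scores))
```

In the example, the first attribute does not discriminate between the alternatives, so essentially all of the weight goes to the second one.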
  • Multi-graph fusion for multi-view spectral clustering
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-18
    Zhao Kang; Guoxin Shi; Shudong Huang; Wenyu Chen; Xiaorong Pu; Joey Tianyi Zhou; Zenglin Xu

    A panoply of multi-view clustering algorithms has been developed to deal with prevalent multi-view data. Among them, spectral clustering-based methods have drawn much attention and demonstrated promising results recently. Despite progress, there are still two fundamental questions that stay unanswered to date. First, how to fuse different views into one graph. More often than not, the similarities between samples may be manifested differently by different views. Many existing algorithms either simply take the average of multiple views or just learn a common graph. These simple approaches fail to consider the flexible local manifold structures of all views. Hence, the rich heterogeneous information is not fully exploited. Second, how to learn the explicit cluster structure. Most existing methods do not pay attention to the quality of the graphs and perform graph learning and spectral clustering separately. Those unreliable graphs might lead to suboptimal clustering results. To fill these gaps, in this paper, we propose a novel multi-view spectral clustering model which performs graph fusion and spectral clustering simultaneously. The fusion graph approximates the original graph of each individual view but maintains an explicit cluster structure. Experiments on four widely used data sets confirm the superiority of the proposed method.

    Updated: 2020-01-16
  • A two-stage dynamic influence model-achieving decision-making consensus within large scale groups operating with incomplete information
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-18
    Shengli Li; Cuiping Wei

    The decision making environment has been dramatically affected by rapid developments in society and the economy. Large-scale group decision making (LSGDM) based on social network has become a vital research topic in the field of decision making science. In this paper, we propose a novel framework based on social network to manage the consensus reaching process (CRP) for LSGDM faced with incomplete information. In this framework, the large-scale group is first classified into several smaller sub-groups using a sub-group detection algorithm, based on the social network. Then, we propose an estimating method based on a collaborative filtering algorithm for estimating the missing preference information of the opinion leaders in each sub-group. The two-stage dynamic influence model for handling the consensus reaching process in LSGDM begins when the LSGDM is transformed into several smaller sub-group decision processes. In the first stage, a consensus model, based on opinion evolution, is proposed for the CRP within each sub-group. In the second stage, we consider each sub-group as a decision making unit. By focusing on the consensus problem across the sub-groups, we develop a novel opinion-leaders feedback strategy in order to help the sub-groups revise their opinions, working toward consensus. We provide an example of an application of our process to illustrate the validity of the proposed model for managing the CRP in LSGDM.

    Updated: 2020-01-16
  • CSAN: A neural network benchmark model for crime forecasting in spatio-temporal scale
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-19
    Qi Wang; Guangyin Jin; Xia Zhao; Yanghe Feng; Jincai Huang

    Understanding the evolving discipline of crime situations is a long-standing but significant problem. Earlier methods favor stochastic modeling of the crime phenomenon with physical or statistical equations, which is elegant in theoretical explanation but less efficient in real applications. Recently, some data-driven models, especially neural network models, have shown promising performance in capturing the dynamics of this complex phenomenon, and the availability of massive datasets enables the use of task-beneficial information. However, there exist several difficulties in regional crime situation awareness, including high dimensionality, intractable correlations, and information redundancy in spatio-temporal datasets. To achieve efficient information processing and disentangle relationships from a recent crime dataset covering fifteen years, we construct the crime situation awareness network (CSAN) as a new benchmark forecasting model by integrating the structures of variational auto-encoders and a context-based sequence generative neural network. Final experiments demonstrate that CSAN mostly outperforms other commonly used spatio-temporal forecasting algorithms, such as Conv-LSTM, in regional multi-type crime frequency prediction.

    Updated: 2020-01-16
  • Deep learning approaches for anomaly-based intrusion detection systems: A survey, taxonomy, and open issues
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-16
    Arwa Aldweesh; Abdelouahid Derhab; Ahmed Z. Emam

    The massive growth of data transmitted through a variety of devices and communication protocols has raised serious security concerns, which have increased the importance of developing advanced intrusion detection systems (IDSs). Deep learning is an advanced branch of machine learning, composed of multiple layers of neurons that represent the learning process. Deep learning can cope with large-scale data and has shown success in different fields. Therefore, researchers have paid more attention to investigating deep learning for intrusion detection. This survey comprehensively reviews and compares the key previous deep learning-focused cybersecurity surveys. Through an extensive review, this survey provides a novel fine-grained taxonomy that categorizes the current state-of-the-art deep learning-based IDSs with respect to different facets, including input data, detection, deployment, and evaluation strategies. Each facet is further classified according to different criteria. This survey also compares and discusses the related experimental solutions proposed as deep learning-based IDSs. By analysing the experimental studies, this survey discusses the role of deep learning in intrusion detection, the impact of intrusion detection datasets, and the efficiency and effectiveness of the proposed approaches. The findings demonstrate that further effort is required to improve the current state of the art. Finally, open research challenges are identified, and future research directions for deep learning-based IDSs are recommended.

    Updated: 2020-01-16
  • Cost-sensitive semi-supervised selective ensemble model for customer credit scoring
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-16
    Jin Xiao; Xu Zhou; Yu Zhong; Ling Xie; Xin Gu; Dunhu Liu

    Only a few customers can be labeled in realistic credit-scoring problems, while many other customers cannot. Further, satisfactory performance is difficult to achieve, as traditional supervised learning methods can only use labeled samples to build credit-scoring models. Semi-supervised learning (SSL) can use both labeled and unlabeled samples to solve this problem, but existing credit-scoring research has primarily constructed single semi-supervised models. This study combines SSL, cost-sensitive learning, the group method of data handling (GMDH), and an ensemble learning technique to propose a GMDH-based cost-sensitive semi-supervised selective ensemble (GCSSE) model. This involves two stages: (1) first, train an ensemble model composed of N base classifiers on the initial training set L with class labels, use it to selectively label the samples from the dataset U without class labels, add them with their predicted labels to the training set, and update the N base classifiers on the new training set; (2) second, classify L and the test set using the respective trained base classifiers, and construct a cost-sensitive GMDH neural network to obtain the selective ensemble classification results for the test set. Experimental comparisons on five public customer credit score datasets and an empirical analysis of a real customer credit score dataset suggest that this model exhibits the best overall credit-scoring performance compared with one supervised ensemble model and three semi-supervised ensemble models.

    Updated: 2020-01-16
  • Active learning through label error statistical methods
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-24
    Min Wang; Ke Fu; Fan Min; Xiuyi Jia

    Clustering-based active learning splits data into a number of blocks and queries the labels of the most critical instances. An active learner must decide how to choose these critical instances and how to split the blocks. In this paper, we present theoretical and practical statistical methods for analyzing the relationship between the label error and the neighbor radius, and design new split and selection strategies to handle these two issues. First, we define statistical functions for the label error based on a single instance and instance pairs. Second, we build practical statistical models, calculate empirical label errors, and guide the block splitting process. Third, using these practical models, we develop a center-and-edge instance selection strategy for choosing critical instances. Fourth, we design a new algorithm called active learning through label error statistical methods (ALSE). Learning experiments were performed with 20 datasets from various domains. The results of significance tests verify the effectiveness of ALSE and its superiority over state-of-the-art active learning algorithms.

    Updated: 2020-01-16
  • HDSM: A distributed data mining approach to classifying vertically distributed data streams
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-14
    Benjamin Denham; Russel Pears; M. Asif Naeem

    The rise in the Internet of Things (IoT) and other sensor networks has created many vertically-distributed and high-velocity data streams that require specialized algorithms for true distributed data mining. This paper proposes a novel Hierarchical Distributed Stream Miner (HDSM) that learns relationships between the features of separate data streams with minimal data transmission to central locations. Experimental evaluation demonstrates significant improvements in classification accuracy over previously proposed distributed stream-mining approaches while minimizing data transmission and computational costs. HDSM’s potential for dynamically trading off accuracy with computational resource costs is also demonstrated.

    Updated: 2020-01-16
  • Discovering process models for the analysis of application failures under uncertainty of event logs
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-10
    Antonio Pecchia; Ingo Weber; Marcello Cinque; Yu Ma

    Computer applications, such as servers, databases and middleware, ubiquitously emit execution traces stored in log files. The use of logs for the analysis of application failures has been known since the early days of computers. Field data studies have shown that application logs are fraught with uncertainty, i.e., missing or noisy events in the logs. A body of research that has dealt successfully with uncertainty in event logs is process mining from the business process management community, specifically by discovering process models. The literature has shown the value of process mining across several domains, but as yet there is no study that quantifies the possible improvements from using process models, and the impact of uncertainty, in the context of application failures. This work addresses the use of process mining for detecting failures from application logs. First, process models are discovered from logs; then conformance checking is used to detect deviations from the models. We contribute to knowledge engineering research with a systematic measurement study that quantifies the failure detection capability of conformance checking in spite of missing events, and its accuracy with respect to process models obtained from noisy logs. The analysis is done with a dataset of 55,462 execution traces from three independent real-life applications. We obtain a mixed answer depending on the application under test; our measurements provide insights into the use of process mining for failure analysis.

    Updated: 2020-01-16
  • Hybrid neural conditional random fields for multi-view sequence labeling
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-24
    Xuli Sun; Shiliang Sun; Minzhi Yin; Hao Yang

    In traditional machine learning, the conditional random field (CRF) is the mainstream probabilistic model for sequence labeling problems. A CRF considers the relation between adjacent labels rather than decoding each label independently, and is therefore expected to achieve better performance. However, there are few multi-view learning methods involving CRF which can be directly used for sequence labeling tasks. In this paper, we propose a novel multi-view CRF model to label sequential data, called MVCRF, which exploits two principles of multi-view learning: consensus and complementarity. We first use different neural networks to extract features from multiple views. Then, considering the consistency among the different views, we introduce a joint representation space for the extracted features and minimize the distance between the two views for regularization. Meanwhile, following the complementarity principle, the features of multiple views are integrated into the framework of CRF. We train MVCRF in an end-to-end fashion and evaluate it on two benchmark data sets. The experimental results illustrate that MVCRF obtains state-of-the-art performance: F1 score 95.44% for chunking on CoNLL-2000, 95.06% for chunking and 96.99% for named entity recognition (NER) on CoNLL-2003.

    Updated: 2020-01-16
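The contrast the MVCRF entry above draws between using adjacent-label relations and decoding each label independently can be illustrated with Viterbi decoding for a linear-chain model. The sketch below is a generic decoder over given score tables, not the MVCRF model; the function name `viterbi`, the score layout, and the toy numbers are assumptions for illustration.

```python
def viterbi(emissions, transitions):
    """Best label sequence for a linear-chain model.

    emissions[t][j]   : score of label j at position t
    transitions[i][j] : score of moving from label i to label j
    Returns the label indices maximising the summed scores. Generic
    sketch; any CRF-style model would supply its own learned scores.
    """
    n_labels = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label for ending at label j in position t.
            best_i = max(range(n_labels),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(range(n_labels), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    emissions = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    no_coupling = [[0.0, 0.0], [0.0, 0.0]]
    sticky = [[0.0, -5.0], [-5.0, 0.0]]  # penalise switching labels
    print(viterbi(emissions, no_coupling))  # same as per-position argmax
    print(viterbi(emissions, sticky))       # transitions change the answer
```

With zero transition scores the decoder simply picks the best label at each position; once switching is penalised, the middle position is relabelled to agree with its neighbours, which is the effect of modelling adjacent labels jointly.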
  • One-step Kernel Multi-view Subspace Clustering
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-17
    Guang-Yu Zhang; Yu-Ren Zhou; Xiao-Yu He; Chang-Dong Wang; Dong Huang

    Multi-view subspace clustering is essential to many scientific problems. However, most existing methods suffer from three issues. First, these methods usually adopt a two-step framework, lacking the ability to achieve an optimal common affinity matrix across multiple views. Second, these methods are intended to solve the clustering problem in linear subspaces but may fail in practice, as most real-world data sets exhibit non-linear structures. Third, most existing subspace-based methods force the negative elements in the coefficient matrix to be positive, which may damage the inherent correlation among the data. To address the above issues, we propose a novel approach termed One-step Kernel Multi-view Subspace Clustering (OKMSC). The common affinity matrix is learned from all views under a one-step framework, which integrates the nonnegative and discriminative properties of the affinity matrix into the computation. Further, a kernelized model is designed to address the nonlinear multi-view clustering problem, and an iterative optimization method is designed to solve the objective function in this model. Extensive experiments have validated the superiority of the proposed method over several state-of-the-art clustering methods.

    Updated: 2020-01-16
  • Deep learning approach on information diffusion in heterogeneous networks
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-29
    Soheila Molaei; Hadi Zare; Hadi Veisi

    There are many real-world complex systems with multi-type interacting entities that can be regarded as heterogeneous networks, including human connections and biological evolutions. One of the main issues in such networks is to predict information diffusion, such as the shape, growth and size of social events and evolutions in the future. While there exist a variety of works on this topic, mainly using a threshold-based approach, they suffer from a local viewpoint on the network and sensitivity to the threshold parameters. In this paper, information diffusion is considered through latent representation learning of the heterogeneous networks, encoded in a deep learning model. To this end, we propose a novel meta-path representation learning approach, Heterogeneous Deep Diffusion (HDD), to exploit meta-paths as the main entities in networks. First, the functional heterogeneous structures of the network are learned as a continuous latent representation by traversing meta-paths, with the aim of a global end-to-end viewpoint. Then, well-known deep learning architectures are employed on our generated features to predict diffusion processes in the network. The proposed approach can be applied to different information diffusion tasks such as topic diffusion and cascade prediction. We demonstrate the proposed approach on benchmark network datasets using well-known evaluation measures. The experimental results show that our approach outperforms the earlier state-of-the-art methods.

    Updated: 2020-01-16
  • DeepLN: A framework for automatic lung nodule detection using multi-resolution CT screening images
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-17
    Xiuyuan Xu; Chengdi Wang; Jixiang Guo; Lan Yang; Hongli Bai; Weimin Li; Zhang Yi
    Updated: 2020-01-16
  • CNAVER: A Content and Network-based Academic VEnue Recommender system
    Knowl. Based Syst. (IF 5.101) Pub Date : 2019-10-17
    Tribikram Pradhan; Sukomal Pal

    The phenomenon of rapidly developing academic venues poses a significant challenge for researchers: how to recognize the ones that are not only in accordance with one’s scholarly interests but also of high significance? Often, even a high-quality paper is rejected because of a mismatch between the research area of the paper and the scope of the journal. Recommending appropriate scholarly venues to researchers empowers them to recognize and partake in important academic conferences and assists them in getting published in impactful journals. A venue recommendation system becomes helpful in this scenario, particularly when exploring a new field or when further choices are required. We propose CNAVER: A Content and Network-based Academic VEnue Recommender system. It provides an integrated framework employing a rank-based fusion of paper-paper peer network (PPPN) model and venue-venue peer network (VVPN) model. It only requires the title and abstract of a paper to provide venue recommendations, thus assisting researchers even at the earliest stage of paper writing. It also addresses cold start issues such as the involvement of an inexperienced researcher and a novel venue along with the problems of data sparsity, diversity, and stability. Experiments on the DBLP dataset show that our proposed approach outperforms several state-of-the-art methods in terms of precision, nDCG, MRR, accuracy, macro F-measure, average venue quality, diversity, and stability.

    Updated: 2020-01-16
  • Construction and exploitation of an historical knowledge graph to deal with the evolution of ontologies
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-16
    Silvio Domingos Cardoso; Marcos Da Silveira; Cédric Pruski

    With the advances of Artificial Intelligence, the need for annotated data increases. However, the quality of these annotations can be impacted by the evolution of domain knowledge since the relations between successive versions of ontologies are rarely described and the history of concepts is not kept at the ontology level. As a consequence, using datasets annotated at different times becomes a real challenge for data- and knowledge-intensive systems. This work presents a way to address this problem. We introduce a Historical Knowledge Graph (HKG), where information from previous versions of an ontology can be found inside a single graph, reducing storage space (no need for versioning) and data treatment time (no need for laborious analysis of each version of the ontology). The HKG proposed in this work represents the evolutionary aspects of the knowledge in a structural way. Examples of the applicability of an HKG for information retrieval and the maintenance of semantic annotations show the capability of our approach for improving the quality of existing techniques.

    Updated: 2020-01-16
  • Learning target-focusing convolutional regression model for visual object tracking
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-16
    Di Yuan; Nana Fan; Zhenyu He

    Discriminative correlation filters (DCFs) have been widely used in the tracking community recently. DCFs-based trackers utilize samples generated by circularly shifting from an image patch to train a ridge regression model, and estimate target location using a response map generated by the correlation filters. However, the generated samples produce some negative effects and the response map is vulnerable to noise interference, which degrades tracking performance. In this paper, to solve the aforementioned drawbacks, we propose a target-focusing convolutional regression (CR) model for visual object tracking tasks (called TFCR). This model uses a target-focusing loss function to alleviate the influence of background noise on the response map of the current tracking image frame, which effectively improves the tracking accuracy. In particular, it can effectively balance the disequilibrium of positive and negative samples by reducing some effects of the negative samples that act on the object appearance model. Extensive experimental results illustrate that our TFCR tracker achieves competitive performance compared with state-of-the-art trackers. The code is available at: https://github.com/deasonyuan/TFCR.

    Updated: 2020-01-16
  • Label propagation-based approach for detecting review spammer groups on e-commerce websites
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-16
    Fuzhi Zhang; Xiaoyan Hao; Jinbo Chao; Shuai Yuan

    Online product reviews are very important information resources on e-commerce websites and significantly influence consumers’ purchase decisions. Driven by interests, however, some merchants might hire a group of reviewers working together to promote or demote a set of target products by writing fake reviews. Such a collusive fraudulent reviewer group is generally termed a review spammer group and is more harmful to e-commerce websites than individual review spammers. To address this issue, in this paper we propose a label propagation-based approach to detect review spammer groups on e-commerce websites. First, based on the evaluation data of reviewers, we extract the associations between reviewers with respect to review time and product ratings to construct a relationship graph of reviewers. Second, we propose an improved label propagation algorithm with a propagation intensity and an automatic filtering mechanism to find candidate spammer groups based on the constructed reviewer relationship graph. Finally, we propose a ranking algorithm that combines the entropy method and the analytic hierarchy process to rank the candidate spammer groups and thus identify the top-k review spammer groups. The experimental results on the real-world Amazon and Yelp datasets show that the proposed approach performs better than the baseline methods.

    Updated: 2020-01-16
  • Subspace clustering by simultaneously feature selection and similarity learning
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-16
    Guo Zhong; Chi-Man Pun

    Learning a reliable affinity matrix is the key to achieving good performance for graph-based clustering methods. However, most current work directly constructs the affinity matrix from the raw data. This may seriously affect the clustering performance, since the original data usually contain noise and even redundant features. On the other hand, although integrating manifold regularization into the framework of clustering algorithms can improve clustering results, some entries of the affinity matrix pre-computed on the original data may not reflect the true similarities between data points. To address the above issues, we propose a novel subspace clustering method that simultaneously learns the similarities between data points and conducts feature selection in a unified optimization framework. Specifically, we learn a high-quality graph under the guidance of a low-dimensional space of the original data such that the obtained affinity matrix can reflect the true similarities between data points as much as possible. A new algorithm based on the augmented Lagrangian multiplier method is designed to find the optimal solution to the problem effectively. Extensive experiments on benchmark datasets demonstrate that our proposed method performs better than state-of-the-art clustering methods.

    Updated: 2020-01-16
  • Rule-based granular classification: A hypersphere information granule-based method
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-15
    Chen Fu; Wei Lu; Witold Pedrycz; Jianhua Yang

    As fundamental abstract constructs supporting the human-centered way of granular computing (GrC), information granules can be used to distinguish different classes of data from the perspective of an easily understood geometrical structure. In this study, a three-stage rule-based granular classification method is proposed using a union of a series of hypersphere information granules. The first stage focuses on dividing each class of data into a series of chunks. The second stage concerns the construction of hyperspheres over these chunks. The resulting hyperspheres form a union information granule that depicts the key structural characteristics of the corresponding data through their union operation. At the final stage, the union information granules are refined, and the rule-based granular classification model emerges by using a series of “If-Then” rules to associate the refined union information granule formed on each class with the corresponding class label. A number of experiments involving several synthetic and publicly available datasets are carried out to exhibit the advantages of the resulting classifier. The impacts of critical parameters on the performance of the constructed classifier are also revealed.

    Updated: 2020-01-15
  • Zero-shot learning by mutual information estimation and maximization
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-14
    Chenwei Tang; Xue Yang; Jiancheng Lv; Zhenan He

    The key to zero-shot learning is to use the visual-semantic embedding to transfer knowledge from seen classes to unseen classes. In this paper, we propose to build the visual-semantic embedding by maximizing the mutual information between visual features and corresponding attributes. Then, the mutual information between visual and semantic features can be utilized to guide the knowledge transfer from the seen domain to the unseen domain. Since we are primarily interested in maximizing mutual information, we introduce noise-contrastive estimation to calculate a lower bound on the mutual information. Through noise-contrastive estimation, we reformulate zero-shot learning as a binary classification problem, i.e., classifying the matching visual-semantic pairs (positive samples) and mismatching visual-semantic pairs (negative/noise samples). Experiments conducted on five datasets demonstrate that the proposed mutual information estimators outperform current state-of-the-art methods in both conventional and generalized zero-shot learning settings.

    Updated: 2020-01-14
  • Learning hierarchical concepts based on higher-order fuzzy semantic cell models through the feed-upward mechanism and the self-organizing strategy
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-13
    Yongchuan Tang; Yunsong Xiao

    Concept representation and learning is a basic topic of artificial intelligence. The aim of this paper is to explore the representation issue and the learning issue of abstract concepts. In this paper, we first introduce higher-order fuzzy semantic cell models to represent abstract concepts, based on which we develop a hierarchical representation of concepts called abstract concept graphs. Then, we put forward an unsupervised algorithm to learn a second-order abstract concept graph from a given data set. This method combines the feed-upward mechanism and the self-organizing strategy. In addition, we provide an evaluation metric for this learning algorithm. A series of experiments is provided to demonstrate the feasibility and validity of the proposed method. We also conduct a preliminary exploration of the potential application of this method to image segmentation.

    Updated: 2020-01-14
  • Graph-regularized least squares regression for multi-view subspace clustering
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-13
    Yongyong Chen; Shuqin Wang; Fangying Zheng; Yigang Cen

    Many works have shown that exploiting the consistency and differences among views makes multi-view subspace clustering results better than single-view clustering results. Therefore, this paper studies the multi-view clustering problem, which aims to divide data points into several groups using multiple features. However, existing multi-view clustering methods fail to capture the grouping effect and local geometrical structure of the multiple features. In order to solve these problems, this paper proposes a novel multi-view subspace clustering model called graph-regularized least squares regression (GLSR), which uses not only least squares regression instead of the nuclear norm to generate the grouping effect, but also a manifold constraint to preserve the local geometrical structure of multiple features. Specifically, the proposed GLSR method adopts least squares regression to learn the global consensus information shared by multiple views and the column-sparsity norm to measure the residual information. Under the alternating direction method of multipliers framework, an effective method is developed that iteratively updates all variables. Numerical studies on eight real databases demonstrate the effectiveness and superior performance of the proposed GLSR over eleven state-of-the-art methods.

    Updated: 2020-01-13
  • Deep Collaborative Embedding for information cascade prediction
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-11
    Yuhui Zhao; Ning Yang; Tao Lin; Philip S. Yu

    Recently, information cascade prediction has attracted increasing interest from researchers, but it is far from being well solved, partly due to three defects of the existing works. First, the existing works often assume an underlying information diffusion model, which is impractical in the real world due to the complexity of information diffusion. Second, the existing works often ignore the prediction of the infection order, which also plays an important role in social network analysis. Finally, the existing works often require knowledge of the underlying diffusion networks, which are likely unobservable in practice. In this paper, we aim at the prediction of both node infection and infection order without requiring knowledge about the underlying diffusion mechanism and the diffusion network, where the challenges are two-fold. The first is what cascading characteristics of nodes should be captured and how to capture them, and the second is how to model the non-linear features of nodes in information cascades. To address these challenges, we propose a novel model called Deep Collaborative Embedding (DCE) for information cascade prediction, which can capture not only the node structural property but also two kinds of node cascading characteristics. We propose an auto-encoder based collaborative embedding framework to learn the node embeddings with cascade collaboration and node collaboration, so that the non-linearity of information cascades can be effectively captured. The results of extensive experiments conducted on real-world datasets verify the effectiveness of our approach.

    Updated: 2020-01-13
  • Particle filtering methods for stochastic optimization with application to large-scale empirical risk minimization
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-10
    Bin Liu

    This paper is concerned with sequential filtering based stochastic optimization (FSO) approaches that leverage a probabilistic perspective to implement the incremental proximity method (IPM). The present FSO methods are derived based on the Kalman filter (KF) and the extended KF (EKF). In contrast with typical methods such as stochastic gradient descent (SGD) and IPMs, they do not need to pre-schedule the learning rate for convergence. Nevertheless, they have limitations inherited from the KF mechanism. As the particle filtering (PF) method remarkably outperforms KF and its variants for nonlinear non-Gaussian sequential filtering problems, it is natural to ask whether FSO methods can benefit from PF to get around their limitations. We provide an affirmative answer to this question by developing two PF based stochastic optimizers (PFSOs). For performance evaluation, we apply them to nonlinear least-squares fitting with simulated data, and to empirical risk minimization for binary classification of real data sets. Experimental results demonstrate that PFSOs remarkably outperform a benchmark SGD algorithm, the vanilla IPM, and KF-type FSO methods in terms of numerical stability, convergence speed, and flexibility in handling diverse types of loss functions.

    Updated: 2020-01-11
  • GMM: A generalized mechanics model for identifying the importance of nodes in complex networks
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-10
    Fan Liu; Zhen Wang; Yong Deng

    How to assess the importance of nodes in a network is an open question. There are many ways to identify the importance of nodes in complex networks, but each has its own shortcomings and advantages. In particular, some methods based on the interactions between nodes have been proposed. These methods utilize local information or path information. How to combine local and global information is still a problem. In this paper, a generalized mechanical model is proposed that uses both global information and local information. To verify the effectiveness of the method, experiments were performed on a total of ten real networks. In particular, an innovative experimental network-based quality assessment was proposed to validate the method of identifying the importance of nodes.

    Updated: 2020-01-11
  • Machine learning based decision making for time varying systems: Parameter estimation and performance optimization
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-10
    Yiyang Chen; Yingwei Zhou

    The class of decision making problems focuses on the optimization of single or multiple design objectives, and the classical decision making procedures require the full scope of the system information. However, the system dynamics consist of unknown time varying parameters within a specific range of dynamic decision making problems, which cannot be handled by the classical procedures. To solve these problems, this paper proposes a machine learning based decision making algorithm. It uses the technique of machine learning to estimate the real-time unknown parameters using the recorded system data, and makes appropriate decisions using model predictive control (MPC) method to optimize some desired key performance indicators (KPIs). The effective performance of the proposed algorithm is further evaluated using a simulation based case study.

    Updated: 2020-01-11
  • A comprehensive exploration of semantic relation extraction via pre-trained CNNs
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-10
    Qing Li; Lili Li; Weinan Wang; Qi Li; Jiang Zhong

    Semantic relation extraction between entity pairs is a crucial task in information extraction from text. In this paper, we propose a new pre-trained network architecture for this task, called the XM-CNN. The XM-CNN utilizes word embedding and position embedding information. It is designed to reinforce the contextual output from the MT-DNNKD pre-trained model. Our model uses an entity-aware attention mechanism to detect features and also applies relation-specific pooling attention mechanisms. The experimental results show that the XM-CNN achieves state-of-the-art results on SemEval-2010 Task 8, and a thorough evaluation of the method is conducted.

    Updated: 2020-01-11
  • Multiscale cascading deep belief network for fault identification of rotating machinery under various working conditions
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-09
    Xiaoan Yan; Ying Liu; Minping Jia

    Deep learning is characterized by strong self-learning and fault classification ability without the manual feature extraction stage of traditional algorithms. The deep belief network (DBN) is one of the most classic deep learning models. However, the traditional DBN is mainly restricted to automatically learning single-scale features from the raw vibration signal when identifying the fault type, which implies that important information inherent in other scales of the vibration data is neglected, easily leading to unsatisfactory diagnosis results. To alleviate this problem, this paper presents a novel architecture named multiscale cascading deep belief network (MCDBN) for automatic fault identification of rotating machinery, which aims at learning a broader feature representation and improving recognition precision. Firstly, a sliding window with data overlap is adopted to split the collected raw vibration signal into a group of equal-sized sub-signals, and then an improved multiscale coarse-graining procedure is applied to each sub-signal to obtain coarse-grained time series at different scales. Meanwhile, the Fourier spectrum at different scales is calculated to capture multiscale characteristics. Finally, multiple DBN architectures with three hidden layers are designed to learn high-level feature representations directly from the multiscale characteristics in a parallel manner and to accomplish fault identification automatically through cascading and a softmax classifier, without manual expertise. Results of two experimental cases of mechanical fault identification under different working conditions indicate that the proposed method achieves better diagnostic performance than the standard DBN and traditional multiscale feature extractors.

    Updated: 2020-01-11
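The windowing and coarse-graining steps mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard coarse-graining used in multiscale analysis (averaging consecutive non-overlapping blocks), not the paper's "improved" procedure, which the abstract does not specify. The function names `sliding_windows` and `coarse_grain` and the parameters `width`, `step`, and `scale` are hypothetical.

```python
def sliding_windows(signal, width, step):
    """Split a sequence into equal-sized windows; consecutive windows
    overlap by (width - step) samples when step < width."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    last_start = len(signal) - width
    return [signal[i:i + width] for i in range(0, last_start + 1, step)]


def coarse_grain(window, scale):
    """Standard coarse-graining (an assumption, not the paper's improved
    variant): average consecutive non-overlapping blocks of `scale` samples.
    Trailing samples that do not fill a whole block are dropped."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    n_blocks = len(window) // scale
    return [
        sum(window[j * scale:(j + 1) * scale]) / scale
        for j in range(n_blocks)
    ]


# Usage: windows of 4 samples with 50% overlap, coarse-grained at scale 2.
windows = sliding_windows([1, 2, 3, 4, 5, 6], width=4, step=2)
multiscale = [coarse_grain(w, scale=2) for w in windows]
```

Each coarse-grained series could then be passed to a spectrum estimate and a per-scale model, which is outside the scope of this sketch.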
  • Bag-of-Concepts representation for document classification based on automatic knowledge acquisition from probabilistic knowledge base
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-09
    Pengfei Li; Kezhi Mao; Yuecong Xu; Qi Li; Jiaheng Zhang

    Text representation, a crucial step for text mining and natural language processing, concerns transforming unstructured textual data into structured numerical vectors to support various machine learning and data mining algorithms. For document classification, one classical and commonly adopted text representation method is the Bag-of-Words (BoW) model. BoW represents a document as a fixed-length vector of terms, where each term dimension is a numerical value such as term frequency or tf-idf weight. However, BoW simply looks at the surface form of words. It ignores the semantic, conceptual and contextual information of texts, and also suffers from high dimensionality and sparsity issues. To address the aforementioned issues, we propose a novel document representation scheme called Bag-of-Concepts (BoC), which automatically acquires useful conceptual knowledge from an external knowledge base, then conceptualizes words and phrases in the document into higher-level semantics (i.e. concepts) in a probabilistic manner, and eventually represents a document as a distributed vector in the learned concept space. By utilizing background knowledge from the knowledge base, the BoC representation is able to provide more semantic and conceptual information about texts, as well as better interpretability for human understanding. We also propose the Bag-of-Concept-Clusters (BoCCl) model, which clusters semantically similar concepts together and performs entity sense disambiguation to further improve the BoC representation. In addition, we combine BoCCl and BoW representations using an attention mechanism to effectively utilize both concept-level and word-level information and achieve optimal performance for document classification.

    Updated: 2020-01-09
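As a small illustration of the Bag-of-Words baseline described in the abstract above (a fixed-length vector with one term-frequency entry per vocabulary term), here is a minimal sketch. It covers only the BoW baseline, not the proposed BoC or BoCCl models. The function name `bow_vector` and the whitespace tokenization are assumptions made for the example.

```python
from collections import Counter


def bow_vector(document, vocabulary):
    """Return a fixed-length term-frequency vector for `document`.

    Tokenization is a naive lowercase whitespace split (an assumption for
    this sketch); entry k counts occurrences of vocabulary[k].
    """
    counts = Counter(document.lower().split())
    # Counter returns 0 for terms that never occur, so no KeyError here.
    return [counts[term] for term in vocabulary]


# Usage: two documents mapped into the same 3-dimensional term space.
vocab = ["the", "cat", "bird"]
vec_a = bow_vector("The cat saw the dog", vocab)
vec_b = bow_vector("A bird sang", vocab)
```

Because the vector only counts surface forms, "cat" and "kitten" land in unrelated dimensions, which is the limitation the abstract attributes to BoW.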
  • Cross Multi-Type Objects Clustering in Attributed Heterogeneous Information Network
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-07
    Sheng Zhou; Jiajun Bu; Zhen Zhang; Can Wang; Lingzhou Ma; Jianfeng Zhang

    Real-world networks usually consist of a large number of interacting, multi-typed components and are usually referred to as heterogeneous information networks (HINs). An HIN that is associated with various attributes on its nodes is defined as an attributed HIN (or AHIN). Clustering is a fundamental task for HINs and AHINs. However, most existing methods focus on nodes of a single type, and there is very limited work that groups objects of different types into the same cluster. This is largely because object similarities can be either attribute-based or link-based between nodes of the same type, and it is challenging to incorporate both in a unified framework. To bridge this gap, in this paper, we propose a framework, namely Cross Multi-Type Objects Clustering in Attributed Heterogeneous Information Network, or CMOC-AHIN, to integrate both the attribute information and multi-type node clustering in a principled way. We empirically show the superior performance of CMOC-AHIN on three large-scale challenging data sets and also summarize insights on its performance compared to other state-of-the-art methodologies.

    Updated: 2020-01-07
  • On rule acquisition methods for data classification in heterogeneous incomplete decision systems
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-07
    Zuqiang Meng; Zhongzhi Shi

    In the age of big data, much of the data obtained is low-quality data characterized by heterogeneity and incompleteness, referred to as heterogeneous incomplete decision systems (HIDSs) in this paper. Data classification is an important task in machine learning, with the ability to discover valuable knowledge hidden in HIDSs. However, systematic studies on data classification in HIDSs are rarely reported. In particular, there is a lack of adaptive classification methods for HIDSs that can deal directly with heterogeneous incomplete data and do not require prior discretization of numerical attributes or filling in of missing values. In this paper, a unified representation model, called the parameterized tolerance granulation model (PTGM), is proposed to deal with heterogeneous incomplete data. The principle of an adaptive granulation method for constructing appropriate PTGMs is also described, using difference-based collaborative optimization. Based on PTGMs, decision logic language is used to describe classifiers consisting of decision rules satisfying given conditions. Then, a discernibility function-based classification method and a heuristic function-based classification method are proposed to obtain all optimized rule sets (classifiers) and to generate a particular optimized rule set, respectively. The heuristic function-based method is an adaptive classification method, which can deal directly with heterogeneous incomplete data. Furthermore, detailed theoretical analyses are given to illustrate the correctness and effectiveness of the proposed methods. The experimental results show that the proposed methods are effective and have obvious advantages in directly handling heterogeneous incomplete data.

    Updated: 2020-01-07
  • A temporal-window framework for modeling and forecasting time series
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-07
    Paulo S.G. de Mattos Neto; George D.C. Cavalcanti; Paulo R.A. Firmino; Eraylson G. Silva; Sérgio R.P. Vila Nova Filho

    Time series have become a valuable source of study in many areas, mainly because they encapsulate underlying time-indexed variables. A significant part of these studies is dedicated to fitting a single model to past data in order to forecast future values of the series. However, single models may not be able to adequately fit local patterns; that is, particular and eventually recurrent variations dynamically incorporated in the series as time evolves. This temporal-window oriented paradigm has been at the vanguard of time series modeling and forecasting exercises. The present paper proposes a simple local-pattern oriented system to model and forecast time series. Our approach involves three steps: (i) the time series is split into k subsets in such a way that each subset may intersect its neighbors; (ii) each subset is modeled, considering lags according to confidence intervals of the auto-correlation function; and (iii) the target values of the time series are matched against the modeled subsets via dynamic time warping. The usefulness of the proposed framework is illustrated by modeling and forecasting real-world time series. Evaluation metrics were adopted to compare the proposed approach with multilayer perceptron neural networks and support vector regression predictors. The results provided by published models are also taken into account, and it was found that the proposed system presented better performance than the compared models in the experiments.

    Updated: 2020-01-07
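Step (iii) of the abstract above relies on dynamic time warping (DTW) to relate a target pattern to previously modeled subsets. The following is a minimal sketch of the classic DTW distance with an absolute-difference local cost, plus a helper that picks the closest window. It is a textbook formulation under stated assumptions, not the authors' implementation; the names `dtw_distance` and `closest_window` are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences,
    using |x - y| as the local cost and no warping-window constraint."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched to an earlier b element
                cost[i][j - 1],      # b[j-1] also matched to an earlier a element
                cost[i - 1][j - 1],  # one-to-one match
            )
    return cost[n][m]


def closest_window(query, windows):
    """Return the index of the window with the smallest DTW distance to query."""
    return min(range(len(windows)), key=lambda k: dtw_distance(query, windows[k]))


# Usage: the second window matches the query up to a repeated sample.
best = closest_window([1, 2, 3], [[5, 6, 7], [1, 2, 2, 3], [0, 0, 0]])
```

The quadratic-time table is adequate for short windows; longer series would typically use a constrained warping band.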
  • RuleKit: A comprehensive suite for rule-based learning
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-07
    Adam Gudyś; Marek Sikora; Łukasz Wróbel

    Rule-based models are often used for data analysis as they combine interpretability with predictive power. We present RuleKit, a versatile tool for rule learning. Based on a sequential covering induction algorithm, it is suitable for classification, regression, and survival problems. The presence of user-guided induction facilitates verifying hypotheses concerning data dependencies which are expected or of interest. The powerful and flexible experimental environment allows straightforward investigation of different induction schemes. The analysis can be performed in batch mode, through a RapidMiner plug-in, or through an R package. The software is available on GitHub (https://github.com/adaa-polsl/RuleKit) under the GNU AGPL-3.0 license.

    Updated: 2020-01-07
  • Fast discrete factorization machine for personalized item recommendation
    Knowl. Based Syst. (IF 5.101) Pub Date : 2020-01-07
    Shilin Qu; Guibing Guo; Yuan Liu; Yuan Yao; Wei Wei

    Personalized item recommendation has become an essential target of Web applications, but it suffers from efficiency problems due to the large volume of data. In particular, feature-based factorization machine models are generally limited by the vast number of feature dimensions, leading to catastrophic computation time. In this paper, we propose a Fast Discrete Factorization Machine (FDFM) method to resolve these issues by applying hash coding techniques to factorization machine models. Specifically, it discretizes the real-valued feature vectors in the parameter model during the process of learning personalized item rankings, whereby the overall computational time can be greatly reduced. Besides, we propose convergent update rules to optimize the quantization loss of the binarization problem, which can be used efficiently in personalized ranking scenarios. Based on an evaluation on two real-world datasets, our proposed approach consistently shows better performance than other baselines, especially when using shorter binary codes.

    Updated: 2020-01-07
Contents have been reproduced by permission of the publishers.