
-
2020 Index IEEE Transactions on Knowledge and Data Engineering Vol. 32 IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2020-12-07
This index covers all technical items - papers, correspondence, reviews, etc. - that appeared in this periodical during the year, and items from previous years that were commented upon or corrected in this year. Departments and other items may also be covered if they have been judged to have archival value. The Author Index contains the primary entry for each item, listed under the first author's name
-
Corrections to “NATERGM: A Model for Examining the Role of Nodal Attributes in Dynamic Social Media Networks” IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2020-11-06 Shan Jiang; Hsinchun Chen
Presents corrections to affiliation information in the above named paper.
-
A Reliable Storage Partition for Permissioned Blockchain IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2020-07-29 Xiaodong Qi; Zhao Zhang; Cheqing Jin; Aoying Zhou
The full-replication data storage mechanism, as commonly utilized in existing blockchains, is the barrier to the system's scalability, since it retains a copy of entire blockchain at each node so that the overall storage consumption per block is $O(n)$ with $n$ participants. Yet another drawback is that this mechanism may limit the throughput in permissioned blockchain. Moreover, due to the existence
-
A Hybrid E-Learning Recommendation Approach Based on Learners’ Influence Propagation IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-25 Shanshan Wan; Zhendong Niu
In e-learning recommender systems, interpersonal information between learners is very scarce, which makes it difficult to apply collaborative filtering (CF) techniques to achieve recommendations. In this study, we propose a hybrid filtering recommendation approach (SI-IFL) combining learner influence model (LIM), self-organization based (SOB) recommendation strategy, and sequential pattern
-
A Scalable Multi-Data Sources Based Recursive Approximation Approach for Fast Error Recovery in Big Sensing Data on Cloud IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-28 Chi Yang; Xianghua Xu; Kotagiri Ramamohanarao; Jinjun Chen
Big sensing data is commonly encountered from various surveillance or sensing systems. Sampling and transferring errors are commonly encountered during each stage of sensing data processing. How to recover from these errors with accuracy and efficiency is quite challenging because of high sensing data volume and unrepeatable wireless communication environment. While Cloud provides a promising platform
-
Adversarial Training Towards Robust Multimedia Recommender System IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-18 Jinhui Tang; Xiaoyu Du; Xiangnan He; Fajie Yuan; Qi Tian; Tat-Seng Chua
With the prevalence of multimedia content on the Web, developing recommender solutions that can effectively leverage the rich signal in multimedia data is in urgent need. Owing to the success of deep neural networks in representation learning, recent advances in multimedia recommendation have largely focused on exploring deep learning methods to improve the recommendation accuracy. To date, however
-
ASCENT: Active Supervision for Semi-Supervised Learning IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-02-04 Yanchao Li; Yongli Wang; Dong-Jun Yu; Ning Ye; Peng Hu; Ruxin Zhao
Active learning algorithms attempt to overcome the labeling bottleneck by asking queries from large collection of unlabeled examples. Existing batch mode active learning algorithms suffer from three limitations: (1) The methods that are based on similarity function or optimizing certain diversity measurement, in which may lead to suboptimal performance and produce the selected set with redundant examples
-
Boosting with Lexicographic Programming: Addressing Class Imbalance without Cost Tuning IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-22 Shounak Datta; Sayak Nag; Swagatam Das
A large amount of research effort has been dedicated to adapting boosting for imbalanced classification. However, boosting methods are yet to be satisfactorily immune to class imbalance, especially for multi-class problems. This is because most of the existing solutions for handling class imbalance rely on expensive cost set tuning for determining the proper level of compensation. We show that the
-
Control-Flow Modeling with Declare: Behavioral Properties, Computational Complexity, and Tools IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-02-04 Valeria Fionda; Antonella Guzzo
Declarative approaches to control-flow modeling use logic-based languages to formalize a number of constraints that valid traces must satisfy. The most noticeable example is the Declare framework based on linear temporal logic. Despite the interest that Declare has been attracting, the current knowledge about its formal properties was rather limited. The goal of this paper is to fill this gap by: (i)
-
Efficient Entity Resolution on Heterogeneous Records IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-02-07 Yiming Lin; Hongzhi Wang; Jianzhong Li; Hong Gao
Entity resolution (ER) is the problem of identifying and merging records that refer to the same real-world entity. In many scenarios, raw records are stored under heterogeneous environment. Specifically, the schemas of records may differ from each other. To leverage such records better, most existing work assume that schema matching and data exchange have been done to convert records under different
-
Efficient Process Conformance Checking on the Basis of Uncertain Event-to-Activity Mappings IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-02-05 Han van der Aa; Henrik Leopold; Hajo A. Reijers
Conformance checking enables organizations to automatically identify compliance violations based on the analysis of observed event data. A crucial requirement for conformance-checking techniques is that observed events can be mapped to normative process models used to specify allowed behavior. Without a mapping, it is not possible to determine if an observed event trace conforms to the specification
-
Generalized Translation-Based Embedding of Knowledge Graph IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-18 Takuma Ebisu; Ryutaro Ichise
Knowledge graphs are useful for many AI tasks but often have missing facts. To populate the graphs, knowledge graph embedding models have been developed. TransE is one of such models and the first translation-based method. TransE is well known because the principle of TransE can effectively capture the rules of a knowledge graph although it seems very simple. However, TransE has problems with its regularization
-
Joint Label Prediction Based Semi-Supervised Adaptive Concept Factorization for Robust Data Representation IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-22 Zhao Zhang; Yan Zhang; Guangcan Liu; Jinhui Tang; Shuicheng Yan; Meng Wang
Constrained Concept Factorization (CCF) yields the enhanced representation ability over CF by incorporating label information as additional constraints, but it cannot classify and group unlabeled data appropriately. Minimizing the difference between the original data and its reconstruction directly can enable CCF to model a small noisy perturbation, but is not robust to gross sparse errors. Besides
-
Joint Learning of Question Answering and Question Generation IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-02-06 Yibo Sun; Duyu Tang; Nan Duan; Tao Qin; Shujie Liu; Zhao Yan; Ming Zhou; Yuanhua Lv; Wenpeng Yin; Xiaocheng Feng; Bing Qin; Ting Liu
Question answering (QA) and question generation (QG) are closely related tasks that could improve each other; however, the connection of these two tasks is not well explored in the literature. In this paper, we present two training algorithms for learning better QA and QG models through leveraging one another. The first algorithm extends Generative Adversarial Network (GAN), which selectively incorporates
-
K-SPIN: Efficiently Processing Spatial Keyword Queries on Road Networks IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-21 Tenindra Abeywickrama; Muhammad Aamir Cheema; Arijit Khan
A significant proportion of all search volume consists of local searches. As a result, search engines must be capable of finding relevant results combining both spatial proximity and textual relevance with high query throughput. We observe that existing techniques answering these spatial keyword queries use keyword aggregated indexing, which has several disadvantages on road networks. We propose K-SPIN
-
Quality Control in Crowdsourcing Using Sequential Zero-Determinant Strategies IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-02-01 Qin Hu; Shengling Wang; Peizi Ma; Xiuzhen Cheng; Weifeng Lv; Rongfang Bie
Quality control in crowdsourcing is challenging due to the heterogeneous nature of the workers. The state-of-the-art solutions attempt to address the issue from the technical perspective, which may be costly because they function as an additional procedure in crowdsourcing. In this paper, an economics based idea is adopted to embed quality control into the crowdsourcing process, where the requestor
-
Semi-Supervised Deep Learning Approach for Transportation Mode Identification Using GPS Trajectory Data IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-02-01 Sina Dabiri; Chang-Tien Lu; Kevin Heaslip; Chandan K. Reddy
Identification of travelers’ transportation modes is a fundamental step for various problems that arise in the domain of transportation such as travel demand analysis, transport planning, and traffic management. In this paper, we aim to identify travelers’ transportation modes purely based on their GPS trajectories. First, a segmentation process is developed to partition a user's trip into GPS segments
-
Trust Relationship Prediction in Alibaba E-Commerce Platform IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-18 Yukuo Cen; Jing Zhang; Gaofei Wang; Yujie Qian; Chuizheng Meng; Zonghong Dai; Hongxia Yang; Jie Tang
This paper introduces how to infer trust relationships from billion-scale networked data to benefit Alibaba E-Commerce business. To effectively leverage the network correlations between labeled and unlabeled relationships to predict trust relationships, we formalize trust into multiple types and propose a graphical model to incorporate type-based dyadic and triadic correlations, namely eTrust. We also
-
A Hybrid Discriminative Mixture Model for Cumulative Citation Recommendation IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-16 Lerong Ma; Dandan Song; Lejian Liao; Jingang Wang
This paper explores Cumulative Citation Recommendation (CCR) for Knowledge Base Acceleration (KBA). The CCR task aims to detect potential citations of a set of target entities with priorities from a volume of temporally-ordered stream corpus. Previous approaches for CCR that build an individual relevance model for each entity fail to deal with unseen entities without annotation. A compromised solution
-
Addressing the Item Cold-Start Problem by Attribute-Driven Active Learning IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-09 Yu Zhu; Jinghao Lin; Shibi He; Beidou Wang; Ziyu Guan; Haifeng Liu; Deng Cai
In recommender systems, cold-start issues are situations where no previous events, e.g., ratings, are known for certain users or items. In this paper, we focus on the item cold-start problem. Both content information (e.g., item attributes) and initial user ratings are valuable for seizing users’ preferences on a new item. However, previous methods for the item cold-start problem either (1) incorporate
-
An Efficient Approach to Finding Dense Temporal Subgraphs IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-09 Shuai Ma; Renjun Hu; Luoshu Wang; Xuelian Lin; Jinpeng Huai
Dense subgraph discovery has proven useful in various applications of temporal networks. We focus on a special class of temporal networks whose nodes and edges are kept fixed, but edge weights regularly vary with timestamps. However, finding dense subgraphs in temporal networks is non-trivial, and its state of the art solution uses a filter-and-verification framework that is not scalable on large temporal
-
Feature Selection for Neural Networks Using Group Lasso Regularization IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-16 Huaqing Zhang; Jian Wang; Zhanquan Sun; Jacek M. Zurada; Nikhil R. Pal
We propose an embedded/integrated feature selection method based on neural networks with Group Lasso penalty. Group Lasso regularization is considered to produce sparsity on the inputs to the network, i.e., for selection of useful features. Lasso based feature selection using a multi-layer perceptron usually requires an additional set of weights, while our Group Lasso formulation does not require that
-
GERF: A Group Event Recommendation Framework Based on Learning-to-Rank IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-16 Yulu Du; Xiangwu Meng; Yujie Zhang; Pengtao Lv
Event recommendation is an essential means to enable people to find attractive upcoming social events, such as party, exhibition, and concert. While growing line of research has focused on suggesting events to individuals, making event recommendation for a group of users has not been well studied. In this paper, we aim to recommend upcoming events for a group of users. We formalize group recommendation
-
Jointly Learning Topics in Sentence Embedding for Document Summarization IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-14 Yang Gao; Yue Xu; Heyan Huang; Qian Liu; Linjing Wei; Luyang Liu
Summarization systems for various applications, such as opinion mining, online news services, and answering questions, have attracted increasing attention in recent years. These tasks are complicated, and a classic representation using bag-of-words does not adequately meet the comprehensive needs of applications that rely on sentence extraction. In this paper, we focus on representing sentences as
-
Multi-Campaign Oriented Spatial Crowdsourcing IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-16 Libin Zheng; Lei Chen
Recently, spatial crowdsourcing has been drawing increasing attention with its great potential in collecting geographical knowledge. The system throughput (number of assigned tasks) and workers’ travel distance are two of many important factors in spatial crowdsourcing, and the improvement to one of them usually means the sacrifice of the other. However, most existing works resolve the trade-off between
-
NewMCOS: Towards a Practical Multi-Cloud Oblivious Storage Scheme IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-09 Zheli Liu; Bo Li; Yanyu Huang; Jin Li; Yang Xiang; Witold Pedrycz
Encryption alone is not enough to protect data privacy, because access pattern leaks some sensitive information. Oblivious RAM (ORAM), the solution to this problem, is still far from practical deployment for heavy storage and communication/computation overhead. To reduce them, an insightful idea was proposed to utilize non-colluding clouds to shift client computation and client-cloud communication
-
On Combining Biclustering Mining and AdaBoost for Breast Tumor Classification IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-14 Qinghua Huang; Yongdong Chen; Longzhong Liu; Dacheng Tao; Xuelong Li
Breast cancer is now considered as one of the leading causes of deaths among women all over the world. Aiming to assist clinicians in improving the accuracy of diagnostic decisions, computer-aided diagnosis (CAD) system is of increasing interest in breast cancer detection and analysis nowadays. In this paper, a novel computer-aided diagnosis scheme with human-in-the-loop is proposed to help clinicians
-
Reducing Web Page Complexity to Facilitate Effective User Navigation IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-16 Min Chen
As a website evolves to align with users’ changing information needs and interests, its structure can outgrow the original design, accumulating links and pages in unanticipated places. This increases complexity to both web pages and the navigation structure, which could cause difficulty in locating relevant links and information. Though the increasing complexity of website and its impact on users’
-
Scalable Spectral Clustering for Overlapping Community Detection in Large-Scale Networks IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-10 Hadrien Van Lierde; Tommy W. S. Chow; Guanrong Chen
While the majority of methods for community detection produce disjoint communities of nodes, most real-world networks naturally involve overlapping communities. In this paper, a scalable method for the detection of overlapping communities in large networks is proposed. The method is based on an extension of the notion of normalized cut to cope with overlapping communities. A spectral clustering algorithm
-
Similarity Join and Similarity Self-Join Size Estimation in a Streaming Environment IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-16 Davood Rafiei; Fan Deng
We study the problem of similarity self-join and similarity join size estimation in a streaming setting where the goal is to estimate, in one scan of the input and with sublinear space in the input size, the number of record pairs that have a similarity within a given threshold. The problem has many applications in data cleaning and query plan generation, where the cost of a similarity join may be
-
SRA: Secure Reverse Auction for Task Assignment in Spatial Crowdsourcing IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-16 Mingjun Xiao; Kai Ma; An Liu; Hui Zhao; Zhixu Li; Kai Zheng; Xiaofang Zhou
In this paper, we study a new type of spatial crowdsourcing, namely competitive detour tasking, where workers can make detours from their original travel paths to perform multiple tasks, and each worker is allowed to compete for preferred tasks by strategically claiming his/her detour costs. The objective is to make suitable task assignment by maximizing the social welfare of crowdsourcing systems
-
Adaptive Consistency Propagation Method for Graph Clustering IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-08-20 Xuelong Li; Mulin Chen; Qi Wang
Graph clustering plays an important role in data mining. Based on an input data graph, data points are partitioned into clusters. However, most existing methods keep the data graph fixed during the clustering procedure, so they are limited to exploit the implied data manifold and highly dependent on the initial graph construction. Inspired by the recent development on manifold learning, this paper
-
Bayesian Networks for Data Integration in the Absence of Foreign Keys IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-09-10 Bohan Zhang; Scott Sanner; Mohamed Reda Bouadjenek; Shagun Gupta
In the era of open data, a single data source rarely contains all of the attributes we need for inference in specific applications. For example, a marketing department may aim to integrate retailer-specific purchase data with separate demographic data for purposes of targeted advertising – a capability not possible with either dataset alone. In this work, we address two key desiderata of an automated
-
Discrimination-Aware Projected Matrix Factorization IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-08-22 Xuelong Li; Mulin Chen; Qi Wang
Non-negative Matrix Factorization (NMF) has been one of the most popular clustering techniques in machine learning, and involves various real-world applications. Most existing works perform matrix factorization on high-dimensional data directly. However, the intrinsic data structure is always hidden within the low-dimensional subspace. And, the redundant features within the input space may affect the
-
LN-SNE: Log-Normal Distributed Stochastic Neighbor Embedding for Anomaly Detection IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-08-14 Zahra Ghafoori; Sarah M. Erfani; James C. Bezdek; Shanika Karunasekera; Christopher Leckie
We present a new unsupervised dimensionality reduction technique, called LN-SNE, for anomaly detection. LN-SNE generates a parametric embedding by means of Restricted Boltzmann Machines and uses a heavy-tail distribution to project data to a lower dimensional space such that dissimilarities between normal data and anomalies are preserved or strengthened. We compare LN-SNE to several benchmark dimensionality
-
Nonparametric Density Estimation Using Copula Transform, Bayesian Sequential Partitioning, and Diffusion-Based Kernel Estimator IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-07-23 Aref Majdara; Saeid Nooshabadi
Non-parametric density estimation methods are more flexible than parametric methods, due to the fact that they do not assume any specific shape or structure for the data. Most non-parametric methods, like Kernel estimation, require tuning of parameters to achieve good data smoothing, a non-trivial task, even in low dimensions. In higher dimensions, sparsity of data in local neighborhoods becomes a
-
Personalized Video Recommendation Using Rich Contents from Videos IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-12-07 Xingzhong Du; Hongzhi Yin; Ling Chen; Yang Wang; Yi Yang; Xiaofang Zhou
Video recommendation has become an essential way of helping people explore the massive videos and discover the ones that may be of interest to them. In the existing video recommender systems, the models make the recommendations based on the user-video interactions and single specific content features. When the specific content features are unavailable, the performance of the existing models will seriously
-
A Transformation-based Framework for KNN Set Similarity Search IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-12-12 Yong Zhang; Jiacheng Wu; Jin Wang; Chunxiao Xing
Set similarity search is a fundamental operation in a variety of applications. While many previous studies focus on threshold based set similarity search and join, few efforts have been paid for KNN set similarity search. In this paper, we propose a transformation based framework to solve the problem of KNN set similarity search, which given a collection of set records and a query set, returns k results
-
ChronoGraph: Enabling temporal graph traversals for efficient information diffusion analysis over time IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-09 Jaewook Byun; Sungpil Woo; Daeyoung Kim
ChronoGraph is a novel system enabling temporal graph traversals. Compared to snapshot-oriented systems, this traversal-oriented system is suitable for analyzing information diffusion over time without violating a time constraint on temporal paths. The cornerstone of ChronoGraph aims at bridging the chasm between point-based semantics and period-based semantics and the gap between temporal graph traversals
-
Deep Inductive Graph Representation Learning IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-11-01 Ryan Anthony Rossi; Rong Zhou; Nesreen Ahmed
This paper presents a general inductive graph representation learning framework called DeepGL for learning deep node and edge features that generalize across-networks. In particular, DeepGL begins by deriving a set of base features from the graph (e.g., graphlet features) and automatically learns a multi-layered hierarchical graph representation where each successive layer leverages the output from
-
Dynamic Connection-based Social Group Recommendation IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-11-05 Dong Qin; Xiangmin Zhou; Lei Chen; Guangyan Huang; Yanchun Zhang
Group recommendation has become highly demanded when users communicate in the forms of group activities in online sharing communities. These group activities include student group study, family TV program watching, friends travel decision, etc. Existing group recommendation techniques mainly focus on the small user groups. However, online sharing communities have enabled group activities among thousands
-
Flow Prediction in Spatio-Temporal Networks Based on Multitask Deep Learning IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-09 Junbo Zhang; Yu Zheng; Junkai Sun; Dekang Qi
Predicting flows (e.g. the traffic of vehicles, crowds and bikes), consisting of the in-out traffic at a node and transitions between different nodes, in a spatio-temporal network plays an important role in transportation systems. However, this is a very challenging problem, affected by multiple complex factors, such as spatial correlations between different locations, temporal correlations among different
-
Learning New Words from Keystroke Data with Local Differential Privacy IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-12-07 Sungwook Kim; Hyejin Shin; Chung Hun Baek; Soohyung Kim; Junbum Shin
Keystroke data collected from smart devices includes various sensitive information about users. Collecting and analyzing such data raise serious privacy concerns. Google and Apple have recently applied local differential privacy (LDP) to address privacy issue on learning new words from users' keystroke data. However, these solutions require multiple LDP reports for a single word, which result in inefficient
-
Re-revisiting Learning on Hypergraphs: Confidence Interval, Subgradient Method and Extension to Multiclass IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-11-09 Chenzi Zhang; Shuguang Hu; Zhihao Gavin Tang; T-H. Hubert Chan
We revisit semi-supervised learning on hypergraphs. Same as previous approaches, our method uses a convex program whose objective function is not everywhere differentiable. We exploit the non-uniqueness of the optimal solutions, and consider confidence intervals which give the exact ranges that unlabeled vertices take in any optimal solution. Moreover, we give a much simpler approach for solving the
-
SAL-hashing: A Self-Adaptive Linear Hashing Index for SSDs IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-12-06 Peiquan Jin; Chengcheng Yang; Xiaoliang Wang; Lihua Yue; Dezhi Zhang
Flash memory based solid state drives (SSDs) have emerged as a new alternative to replace magnetic disks due to their high performance and low power consumption. However, random writes on SSDs are much slower than SSD reads. Therefore, traditional index structures, which are designed based on the symmetrical I/O property of magnetic disks, cannot completely exert the high performance of SSDs. In this
-
Scheduling Resources to Multiple Pipelines of One Query in a Main Memory Database Cluster IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-12-06 Zhuhe Fang; Chuliang Weng; Li Wang; Huiqi Hu; Aoying Zhou
To fully utilize the resources of a main memory database cluster, we additionally take the independent parallelism into account to parallelize multiple pipelines of one query. However, scheduling resources to multiple pipelines is an intractable problem. Traditional static approaches to this problem may lead to a serious waste of resources and suboptimal execution order of pipelines, because it is
-
Unlocking Author Power: On the Exploitation of Auxiliary Author-Retweeter Relations for Predicting Key Retweeters IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-12-27 Bo Wu; Wen-Huang Cheng; Yongdong Zhang; Juan Cao; Jintao Li; Tao Mei
Retweeting is a powerful driving force of information propagation on microblogging. How to identify effective retweeters of a message (called "key retweeter prediction" problem) has then become a significant research topic. Conventional approaches addressed this topic mainly from two aspects by analyzing either the personal attributes of microblogging users or the network structure of user graphs.
-
User Interface Derivation for Business Processes IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-09 Lei Han; Jian Yang; Weiliang Zhao; Quan Z. Sheng
User Interfaces (UI) are the bridge to connect Business Processes (BPs) and end users. The implementation of UIs normally needs a lot of manual efforts of developers. Aiming to resolve this issue, this work proposes a UI derivation method with a role-enriched BP (REBP) model as its foundation. This process model has the capability to present the details of task control flow and data operations in tasks
-
Using Latent Knowledge to Improve Real-Time Activity Recognition for Smart IoT IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2019-01-09 Surong Yan; Kwei-Jay Lin; Xiaolin Zheng; Wenyu Zhang
Real-time/online activity recognition (AR) is an important technology in smart Internet of Things (IoT) systems where users are assisted by smart devices in their daily activities. How to generate appropriate feature representation from sensor event streaming is a challenging issue for accurate and efficient real-time AR. Previous AR models that rely on explicit domain knowledge are not appropriate
-
Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-12-18 Marcel Trotzek; Sven Koitka; Christoph M. Friedrich
Depression is ranked as the largest contributor to global disability and is also a major reason for suicide. Still, many individuals suffering from forms of depression are not treated for various reasons. Previous studies have shown that depression also has an effect on language usage and that many depressed individuals use social media platforms or the internet in general to get information or discuss
-
VA-Store: A Virtual Approximate Store Approach to Supporting Repetitive Big Data in Genome Sequence Analyses IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-12-11 Xianying Liu; Qiang Zhu; Sakti Pramanik; C. Titus Brown; Gang Qian
In recent years, we have witnessed an increasing demand to process big data in numerous applications. It is observed that there often exist substantial amounts of repetitive data in different portions of a big data repository/dataset for applications such as genome sequence analyses. In this paper, we present a novel method, called the VA-Store, to reduce the large space requirement for repetitive
-
The Disruptions of 5G on Data-Driven Technologies and Applications IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2020-01-17 Dumitrel Loghin; Shaofeng Cai; Gang Chen; Tien Tuan Anh Dinh; Feiyi Fan; Qian Lin; Janice Ng; Beng Chin Ooi; Xutao Sun; Quang-Trung Ta; Wei Wang; Xiaokui Xiao; Yang Yang; Meihui Zhang; Zhonghua Zhang
With 5G on the verge of being adopted as the next mobile network, there is a need to analyze its impact on the landscape of computing and data management. In this paper, we analyze the impact of 5G on both traditional and emerging technologies and project our view on future research challenges and opportunities. With a predicted increase of 10-100x in bandwidth and 5-10x decrease in latency, 5G is
-
An Efficient Destination Prediction Approach Based on Future Trajectory Prediction and Transition Matrix Optimization IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-11-28 Zhou Yang; Heli Sun; Jianbin Huang; Zhongbin Sun; Hui Xiong; Shaojie Qiao; Ziyu Guan; Xiaolin Jia
Destination prediction is an essential task in various mobile applications and up to now many methods have been proposed. However, existing methods usually suffer from the problems of heavy computational burden, data sparsity, and low coverage. Therefore, a novel approach named DestPD is proposed to tackle the aforementioned problems. Differing from an earlier approach that only considers the starting
-
Anomaly Detection Using Local Kernel Density Estimation and Context-Based Regression IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-11-20 Weiming Hu; Jun Gao; Bing Li; Ou Wu; Junping Du; Stephen Maybank
Current local density-based anomaly detection methods are limited in that the local density estimation and the neighborhood density estimation are not accurate enough for complex and large databases, and the detection performance depends on the size parameter of the neighborhood. In this paper, we propose a new kernel function to estimate samples’ local densities and propose a weighted neighborhood
-
BRIGHT—Drift-Aware Demand Predictions for Taxi Networks IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-12-05 Amal Saadallah; Luís Moreira-Matias; Ricardo Sousa; Jihed Khiari; Erik Jenelius; João Gama
Massive data broadcast by GPS-equipped vehicles provide unprecedented opportunities. One of the main tasks in order to optimize our transportation networks is to build data-driven real-time decision support systems. However, the dynamic environments where the networks operate disallow the traditional assumptions required to put in practice many off-the-shelf supervised learning algorithms, such as
-
Conversion Prediction from Clickstream: Modeling Market Prediction and Customer Predictability IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-11-30 Jinyoung Yeo; Seung-won Hwang; Sungchul Kim; Eunyee Koh; Nedim Lipka
As 98 percent of shoppers do not make a purchase on the first visit, we study the problem of predicting whether they would come back for a purchase later (i.e., conversion prediction). This problem is important for strategizing “retargeting”, for example, by sending coupons to customers who are likely to convert. For this goal, we study the following two problems, prediction of market and predictability
-
Design and Implementation of SSD-Assisted Backup and Recovery for Database Systems IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-11-30 Yongseok Son; Moonsub Kim; Sunggon Kim; Heon Young Yeom; Nam Sung Kim; Hyuck Han
As flash-based solid-state drives (SSDs) become more prevalent because of the rapid fall in price and the significant increase in capacity, customers expect better data services than those of traditional disk-based systems. However, the order of magnitude performance provided and new characteristics of flash require a rethinking of data services. For example, backup and recovery is an important service in a
-
Enriching Data Imputation under Similarity Rule Constraints IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-11-23 Shaoxu Song; Yu Sun; Aoqian Zhang; Lei Chen; Jianmin Wang
Incomplete information often occurs along with many database applications, e.g., in data integration, data cleaning, or data exchange. The idea of data imputation is often to fill the missing data with the values of its neighbors who share the same/similar information. Such neighbors could either be identified certainly by editing rules or extensively by similarity relationships. Owing to data sparsity
-
Fast and Low Memory Cost Matrix Factorization: Algorithm, Analysis, and Case Study IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-11-19 Yan Yan; Mingkui Tan; Ivor W. Tsang; Yi Yang; Qinfeng Shi; Chengqi Zhang
Matrix factorization has been widely applied to various applications. With the fast development of storage and internet technologies, we have been witnessing a rapid increase of data. In this paper, we propose new algorithms for matrix factorization with the emphasis on efficiency. In addition, most existing methods of matrix factorization only consider a general smooth least square loss. Differently
-
Learning to Weight for Text Classification IEEE Trans. Knowl. Data. Eng. (IF 4.935) Pub Date : 2018-11-28 Alejandro Moreo; Andrea Esuli; Fabrizio Sebastiani
In information retrieval (IR) and related tasks, term weighting approaches typically consider the frequency of the term in the document and in the collection in order to compute a score reflecting the importance of the term for the document. In tasks characterized by the presence of training data (such as text classification) it seems logical that the term weighting function should take into account