-
Computational Intelligence-Based Financial Crisis Prediction Model Using Feature Subset Selection with Optimal Deep Belief Network Big Data (IF 3.644) Pub Date : 2021-01-19 Noura Metawa; Irina V. Pustokhina; Denis A. Pustokhin; K. Shankar; Mohamed Elhoseny
At present, financial decisions are mainly based on classification techniques, which are used to allocate a collection of observations into fixed groups. A diverse set of classification approaches has been presented for forecasting the financial crisis of an institution using past data. An essential process toward the design of a precise financial crisis prediction (FCP) approach comprises
-
Communities Detection for Advertising by Futuristic Greedy Method with Clustering Approach Big Data (IF 3.644) Pub Date : 2021-01-12 Ali Bakhthemmat; Mohammad Izadi
Community detection in social networks is one of the advertising methods in electronic marketing. One of the approaches to find communities in large social networks is to use greedy methods, because these methods perform very fast. Greedy methods are generally designed based on local decisions; thus, inappropriate local decisions may result in an improper global solution. The use of a greedy improved
-
Measuring Customer Similarity and Identifying Cross-Selling Products by Community Detection Big Data (IF 3.644) Pub Date : 2020-12-29 Lili Zhang; Jennifer Priestley; Joseph DeMaio; Sherry Ni; Xiaoguang Tian
Product affinity segmentation discovers groups of customers with similar purchase preferences for cross-selling opportunities to increase sales and customer loyalty. However, this concept can be challenging to implement efficiently and effectively for actionable strategies. First, the nature of skewed and sparse product-level data in the clustering process results in less meaningful solutions. Second
-
Big Data in Business: A Bibliometric Analysis of Relevant Literature Big Data (IF 3.644) Pub Date : 2020-12-15 Haitham Nobanee
This special issue was open for submissions in the field of big data in business. Accordingly, this special issue includes five contributions to the fields of business process innovation in the big data era, unstructured big data analytical methods in firms, online analytical processing approach for business intelligence in big data, geospatial insights for retail recommendation using similarity measures
-
Toward Business Process Innovation in the Big Data Era: A Mediating Roles of Big Data Knowledge Management Big Data (IF 3.644) Pub Date : 2020-12-15 Saide Saide; Margaret L. Sheng
While recent debate recognizes the importance of big data (BD) and knowledge management (KM) in firm performance, there has been a paucity of literature regarding big data analytics technological (BDAT) and knowledge exploration–exploitation capabilities (KEEC) in the context of business process innovation (BPI). This study aims to identify whether BD and KM can be established in these emerging issues
-
On the Unstructured Big Data Analytical Methods in Firms: Conceptual Model, Measurement, and Perception Big Data (IF 3.644) Pub Date : 2020-12-15 Piotr Tarka; Elżbieta Jędrych
Firms face challenging analytical tasks with the advent of a growing amount of unstructured big data (BD). These data lead to radical shifts in their analytical strategies and market insights. Yet the particular types of analytical methods remain loosely scattered in the literature. This work stresses the unstructured BD analytics, first by capturing their unique characteristics and then by proposing
-
Online Analytical Processing for Business Intelligence in Big Data Big Data (IF 3.644) Pub Date : 2020-12-15 Jigna Ashish Patel; Priyanka Sharma
The online analytical processing (OLAP) approach has been widely used in business intelligence for decades to serve multidimensional queries. In this era of cutting-edge technology and the internet, data generation rates have been rising exponentially. Internet of things sensors and social media platforms are some of the major contributors, leading toward the absolute data boom. Storage and speed are the
-
Geospatial Insights for Retail Recommendation Using Similarity Measures Big Data (IF 3.644) Pub Date : 2020-12-15 Choo-Yee Ting; Chiung Ching Ho; Hui-Jia Yee
Recommending a retail business given a particular location of interest is nontrivial. Such a recommendation process requires careful study of demographics, trade area characteristics, sales performance, traffic, and environmental features. It is not only human effort taxing but often introduces inconsistency due to subjectivity in expert opinions. The process becomes more challenging when no sales
-
Overcoming Resistance to Big Data and Operational Changes Through Interactive Data Visualization. Big Data (IF 3.644) Pub Date : 2020-12-15 Gloria Phillips-Wren; Sueanne McKniff
Research has shown that the use of big data can modify operational processes in organizations. However, little research has been conducted on overcoming resistance to the process changes needed for adoption of big data technologies. In this article, we address this gap in the literature by investigating the impact of interactive data visualization on decision-making around operational process changes
-
Indirect Category Data Transfer Learning Algorithm using Regularization Discrimination Big Data (IF 3.644) Pub Date : 2020-12-15 Gang Liu; Xiaofeng Li; Wangyang Liu
To deal with a large amount of redundant data in the indirect category database and inefficient redundancy elimination of the existing methods, we proposed an indirect category data transfer learning algorithm based on regularization discrimination. First of all, we denoised indirect category data, calculated the objective function of distance between the source domain and the target domain, and established
-
Deep Learning for Time Series Forecasting: A Survey Big Data (IF 3.644) Pub Date : 2020-12-03 José F. Torres; Dalil Hadjout; Abderrazak Sebaa; Francisco Martínez-Álvarez; Alicia Troncoso
Time series forecasting has become a very intensive field of research, which is even increasing in recent years. Deep neural networks have proved to be powerful and are achieving high accuracy in many application fields. For these reasons, they are one of the most widely used methods of machine learning to solve problems dealing with big data nowadays. In this work, the time series forecasting problem
-
Learning from Failure: Big Data Analysis for Detecting the Patterns of Failure in Innovative Startups Big Data (IF 3.644) Pub Date : 2020-12-01 Maddalena Cavicchioli; Ulpiana Kocollari
This article aims at identifying appropriate models for analyzing large datasets to serve a twofold goal: first, to better understand the dynamics impacting innovative startups' performance and their managerial practice and, second, to detect their patterns of failure. Therefore, we investigate the interaction of economic–financial, context, and governance dimensions of 4185 Italian innovative startups
-
Satisficing Game Approach to Conflict Resolution for Cooperative Aircraft Sharing Airspace Big Data (IF 3.644) Pub Date : 2020-12-01 Longfang Mu; Songchen Han
Conflict resolution is one of the central tasks during the control of air traffic. In this article, we examined the problem of conflict resolution for unmanned aerial vehicle (UAV) integration into national airspace system and presented an approach based on satisficing game theory to conflict resolution for cooperative UAVs and manned aircraft sharing airspace. This approach ensured the priority of
-
Bridging the Brain and Data Sciences Big Data (IF 3.644) Pub Date : 2020-11-18 John Darrell Van Horn
Brain scientists are now capable of collecting more data in a single experiment than researchers a generation ago might have collected over an entire career. Indeed, the brain itself seems to thirst for more and more data. Such digital information not only comprises individual studies but is also increasingly shared and made openly available for secondary, confirmatory, and/or combined analyses. Numerous
-
Analyzing the Importance of Broker Identities in the Limit Order Book Through Deep Learning Big Data (IF 3.644) Pub Date : 2020-11-17 Samuel Ping-Man Choi; Yin-Hei Chan; Sze-Sing Lam; Hie-Yiin Hung
Limit order books (LOBs) have been widely adopted as a trading mechanism in global securities markets, and the degree of LOB transparency is one of the most studied topics in market design. In the past, this issue was mainly researched through the comparison of LOB transparency in a market before and after a policy change, although such instances were rare and occurred decades ago. This article analyzes
-
Finding Path Motifs in Large Temporal Graphs Using Algebraic Fingerprints Big Data (IF 3.644) Pub Date : 2020-10-19 Suhas Thejaswi; Aristides Gionis; Juho Lauri
We study a family of pattern-detection problems in vertex-colored temporal graphs. In particular, given a vertex-colored temporal graph and a multiset of colors as a query, we search for temporal paths in the graph that contain the colors specified in the query. These types of problems have several applications, for example, in recommending tours for tourists or detecting abnormal behavior in a network
-
Classifying Dissemination Processes in Temporal Graphs Big Data (IF 3.644) Pub Date : 2020-10-19 Lutz Oettershagen; Nils M. Kriege; Christopher Morris; Petra Mutzel
Many real-world graphs are temporal, for example, in a social network, persons only interact at specific points in time. This temporal information directs dissemination processes on the graph, such as the spread of rumors, fake news, or diseases. However, the current state-of-the-art methods for supervised graph classification are mainly designed for static graphs and may not capture temporal information
-
Graph Neural Network-Based Diagnosis Prediction. Big Data (IF 3.644) Pub Date : 2020-10-19 Yang Li; Buyue Qian; Xianli Zhang; Hui Liu
Diagnosis prediction is an important predictive task in health care that aims to predict a patient's future diagnosis based on their historical medical records. A crucial requirement for this task is to effectively model the high-dimensional, noisy, and temporal electronic health record (EHR) data. Existing studies fulfill this requirement by applying recurrent neural networks with attention mechanisms
-
LTSpAUC: Learning Time-Series Shapelets for Partial AUC Maximization Big Data (IF 3.644) Pub Date : 2020-10-19 Akihiro Yamaguchi; Shigeru Maya; Kohei Maruchi; Ken Ueno
Shapelets are discriminative segments used to classify time-series instances. Shapelet methods that jointly learn both classifiers and shapelets have been studied in recent years because such methods provide both interpretable results and superior accuracy. The partial area under the receiver operating characteristic curve (pAUC) for a low range of false-positive rates (FPR) is an important performance
-
NSVD: Normalized Singular Value Deviation Reveals Number of Latent Factors in Tensor Decomposition. Big Data (IF 3.644) Pub Date : 2020-10-19 Yorgos Tsitsikas; Evangelos E Papalexakis
Tensor decomposition has been shown, time and time again, to be an effective tool in multiaspect data mining, especially in exploratory applications where the interest is in discovering hidden interpretable structure from the data. In such exploratory applications, the number of such hidden structures is of utmost importance since incorrect selection may imply the discovery of noisy artifacts that
-
Physics-Guided Deep Learning for Drag Force Prediction in Dense Fluid-Particulate Systems Big Data (IF 3.644) Pub Date : 2020-10-19 Nikhil Muralidhar; Jie Bu; Ze Cao; Long He; Naren Ramakrishnan; Danesh Tafti; Anuj Karpatne
Physics-based simulations are often used to model and understand complex physical systems in domains such as fluid dynamics. Such simulations, although used frequently, often suffer from inaccurate or incomplete representations either due to their high computational costs or due to lack of complete physical knowledge of the system. In such situations, it is useful to employ machine learning (ML) to
-
Quantifying Insurance Agency Channel Dynamics Using Premium Sales Big Data and External Factors Big Data (IF 3.644) Pub Date : 2020-10-08 Erdem Kaya; Eray Alpan; Selim Balcisoy; Burcin Bozkaya
In insurance business, product sales can be realized over a variety of channels such as independent agencies, or bank branches. In 2017, 55% of premium production was generated over insurance agencies in Turkey making independent agency evaluation prominent in the domain. Unfortunately lacking attention from the scientific community, agency evaluation problem is usually tackled in the industry by utilizing
-
A Convolutional Neural Network to Perform Object Detection and Identification in Visual Large-Scale Data Big Data (IF 3.644) Pub Date : 2020-09-29 Riadh Ayachi; Yahia Said; Mohamed Atri
In recent years, big data has become a hard challenge. Analyzing big data requires combining speed with precision. In this article, we describe a deep learning-based method to deal with big data with a focus on precision and speed. In our case, the data are images, which are the hardest type of data to manipulate because their complex structure needs a lot of computation power. Besides, we will
-
The Coming of Age for Big Data in Systems Radiobiology, an Engineering Perspective Big Data (IF 3.644) Pub Date : 2020-09-29 Christos Karapiperis; Anastasia Chasapi; Lefteris Angelis; Zacharias G. Scouras; Pier G. Mastroberardino; Soile Tapio; Michael J. Atkinson; Christos A. Ouzounis
As high-throughput approaches in biological and biomedical research are transforming the life sciences into information-driven disciplines, modern analytics platforms for big data have started to address the needs for efficient and systematic data analysis and interpretation. We observe that radiobiology is following this general trend, with -omics information providing unparalleled depth into the
-
Call for Special Issue Papers: Programming Models and Algorithms for Big Data. Big Data (IF 3.644) Pub Date : 2020-08-25 Fadi Al-Turjman; Walaa Hamouda; Shahid Mumtaz
-
HONEM: Learning Embedding for Higher Order Networks. Big Data (IF 3.644) Pub Date : 2020-08-17 Mandana Saebi; Giovanni Luca Ciampaglia; Lance M Kaplan; Nitesh V Chawla
Representation learning on networks offers a powerful alternative to the oft painstaking process of manual feature engineering, and, as a result, has enjoyed considerable success in recent years. However, all the existing representation learning methods are based on the first-order network, that is, the network that only captures the pairwise interactions between the nodes. As a result, these methods
-
MonkeyKing: Adaptive Parameter Tuning on Big Data Platforms with Deep Reinforcement Learning. Big Data (IF 3.644) Pub Date : 2020-08-17 Haizhou Du; Ping Han; Qiao Xiang; Sheng Huang
Choosing the right parameter configurations for recurring jobs running on big data analytics platforms is difficult because there can be hundreds of possible parameter configurations to pick from. Even the selection of parameter configurations is based on different types of applications and user requirements. The difference between the best configuration and the worst configuration can have a performance
-
Fuzzy Inspired Deep Belief Network for the Traffic Flow Prediction in Intelligent Transportation System Using Flow Strength Indicators. Big Data (IF 3.644) Pub Date : 2020-08-17 Shiju George; Ajit Kumar Santra
The intelligent transportation system (ITS) is an advanced, leading-edge technology that aims to deliver innovative services to different modes of transport and traffic management. Traffic flow prediction (TFP) is one of the key macroscopic parameters of traffic that supports traffic management in ITS. Growth of the real-time data in transportation from various modern equipment, technology, and other resources
-
Coronavirus Optimization Algorithm: A Bioinspired Metaheuristic Based on the COVID-19 Propagation Model. Big Data (IF 3.644) Pub Date : 2020-08-17 F Martínez-Álvarez; G Asencio-Cortés; J F Torres; D Gutiérrez-Avilés; L Melgar-García; R Pérez-Chacón; C Rubio-Escudero; J C Riquelme; A Troncoso
This study proposes a novel bioinspired metaheuristic simulating how the coronavirus spreads and infects healthy people. From a primary infected individual (patient zero), the coronavirus rapidly infects new victims, creating large populations of infected people who will either die or spread infection. Relevant terms such as reinfection probability, super-spreading rate, social distancing measures
-
MRS-DP: Improving Performance and Resource Utilization of Big Data Applications with Deadlines and Priorities. Big Data (IF 3.644) Pub Date : 2020-08-17 Utsav Upadhyay; Geeta Sikka
This article proposes the MapReduce scheduler with deadlines and priorities (MRS-DP), which is capable of handling jobs with deadlines and priorities. Big data have emerged as a key concept and revolutionized data analytics in the present era. Big data are characterized by multiple dimensions or Vs, namely volume, velocity, variety, veracity, and valence. Recently, a new and important dimension (another
-
Call for Special Issue Papers: Evaluation and Experimental Design in Data Mining and Machine Learning. Big Data (IF 3.644) Pub Date : 2020-08-01 Eirini Ntoutsi; Erich Schubert; Arthur Zimek; Albrecht Zimmermann
-
Fog-Based Delay-Sensitive Data Transmission Algorithm for Data Forwarding and Storage in Cloud Environment for Multimedia Applications. Big Data (IF 3.644) Pub Date : 2020-07-14 Azath Mubarakali; Anand Deva Durai; Mohmmed Alshehri; Osama AlFarraj; Jayabrabu Ramakrishnan; Dinesh Mavaluru
Fog computing is playing a vital role in data transmission to distributed devices in the Internet of Things (IoT) and another network paradigm. The fundamental element of fog computing is an additional layer added between an IoT device/node and a cloud server. These fog nodes are used to speed up time-critical applications. Current research efforts and user trends are pushing for fog computing, and
-
Call for Special Issue Papers: Soft Computing Models for Big Data and Internet of Things. Big Data (IF 3.644) Pub Date : 2020-06-19 Naveen Chilamkurti; Anand Paul; Akshi Kumar
-
Call for Special Issue Papers: Internet of Things Data Visualization for Business Intelligence. Big Data (IF 3.644) Pub Date : 2020-06-19 Neeraj Kumar
-
FakeNewsNet: A Data Repository with News Content, Social Context, and Spatiotemporal Information for Studying Fake News on Social Media. Big Data (IF 3.644) Pub Date : 2020-06-01 Kai Shu; Deepak Mahudeswaran; Suhang Wang; Dongwon Lee; Huan Liu
Social media has become a popular means for people to consume and share the news. At the same time, however, it has also enabled the wide dissemination of fake news, that is, news with intentionally false information, causing significant negative effects on society. To mitigate this problem, the research of fake news detection has recently received a lot of attention. Despite several existing computational
-
A Novel Oppositional Chaotic Flower Pollination Optimization Algorithm for Automatic Tuning of Hadoop Configuration Parameters. Big Data (IF 3.644) Pub Date : 2020-06-01 Vidhyasagar Bellamkonda Sathyanarayanan; Raja Paul Perinbam Jeevarathinam; Krishnamurthy Marudhamuthu
At present, with the arrival of the big data era, large volumes of data are generated continuously. Many applications utilize big data platforms, namely Spark, Hadoop, Amazon web services, and so on, since these platforms use several parameters for tuning that further enhance the operating performances. It requires a long duration of time to tune the parameters because of the complex relationship
-
Configuring Parallelism for Hybrid Layouts Using Multi-Objective Optimization. Big Data (IF 3.644) Pub Date : 2020-06-01 Rana Faisal Munir; Alberto Abelló; Oscar Romero; Maik Thiele; Wolfgang Lehner
Modern organizations typically store their data in a raw format in data lakes. These data are then processed and usually stored under hybrid layouts, because they allow projection and selection operations. Thus, they allow (when required) to read less data from the disk. However, this is not very well exploited by distributed processing frameworks (e.g., Hadoop, Spark) when analytical queries are posed
-
Community Detection in Social Networks Using Affinity Propagation with Adaptive Similarity Matrix. Big Data (IF 3.644) Pub Date : 2020-06-01 Sona Taheri; Asgarali Bouyer
Community detection problem is a projection of data clustering where the network's topological properties are only considered for measuring similarities among nodes. Also, finding communities' kernel nodes and expanding a community from kernel will certainly help us to find optimal communities. Among the existing community detection approaches, the affinity propagation (AP)-based method has been showing
-
Moth-Flame Optimization-Bat Optimization: Map-Reduce Framework for Big Data Clustering Using the Moth-Flame Bat Optimization and Sparse Fuzzy C-Means. Big Data (IF 3.644) Pub Date : 2020-05-19 Vasavi Ravuri; S Vasundra
The technical advancements in big data have become popular and most desirable among users for storing, processing, and handling huge data sets. However, clustering using these big data sets has become a major challenge in big data analysis. The conventional clustering algorithms used scalable solutions for managing huge data sets. Thus, this study proposes a technique for big data clustering using
-
Call for Special Issue Papers: Internet of Things Data Visualization for Business Intelligence. Big Data (IF 3.644) Pub Date : 2020-05-05 Neeraj Kumar
-
Call for Special Issue Papers: Soft Computing Models for Big Data and Internet of Things. Big Data (IF 3.644) Pub Date : 2020-04-30 Naveen Chilamkurti; Anand Paul; Akshi Kumar
-
Call for Special Issue Papers: Multimedia Big Data Analytics for Engineering Education. Big Data (IF 3.644) Pub Date : 2020-04-28 Priyan Malarvizhi Kumar; Hari Mohan Pandey; Gautam Srivastava
-
The Evolution of Publication Hotspots in Electronic Health Records from 1957 to 2016 and Differences Among Six Countries. Big Data (IF 3.644) Pub Date : 2020-04-17 Yanjun Wang; Ye Zhao; Weijia Dang; Jianzhong Zheng; Haiyuan Dong
This study aims to reveal the evolution of publication hotspots in the field of electronic health records (EHRs) and differences among countries. We applied keyword frequency analysis, keyword co-occurrence analysis, principal component analysis, multidimensional scaling analysis, and visualization technology to compare the high-frequency Medical Subject Heading (MeSH) terms in six countries during
-
CDNB: CAVIAR-Dragonfly Optimization with Naive Bayes for the Sentiment and Affect Analysis in Social Media. Big Data (IF 3.644) Pub Date : 2020-04-17 Harshali P Patil; Mohammad Atique
With the advent of the new information technologies, the growth of online reviews regarding an organization or a company or any other sector has been playing a vital role in improving the sector plans and decisions. The vast significance of the online reviews that determine the sentiment polarity is the hectic challenge of the current scenario. Sentiment classification is a process of classifying the
-
SecDedoop: Secure Deduplication with Access Control of Big Data in the HDFS/Hadoop Environment. Big Data (IF 3.644) Pub Date : 2020-04-17 P Ramya; C Sundar
With the rapid growth of storage providers, data deduplication is an essential storage optimization technique that greatly minimizes data storage costs by storing a unique copy of duplicate data. Nowadays, deduplication introduces various new challenges such as security and insufficient space issue. Hence, in this article, we propose a secure data deduplication with access control of big data over
-
Optimal Feature Selection for Big Data Classification: Firefly with Lion-Assisted Model. Big Data (IF 3.644) Pub Date : 2020-04-17 Ramar Senthamil Selvi; Muniyappan Lakshapalam Valarmathi
In this article, the proposed method develops a big data classification model with the aid of intelligent techniques. Here, the Parallel Pool Map reduce Framework is used for handling big data. The model involves three main phases, namely (1) feature extraction, (2) optimal feature selection, and (3) classification. For feature extraction, the well-known feature extraction techniques such as principle
-
Call for Special Issue Papers: Multimedia Big Data Analytics for Engineering Education. Big Data (IF 3.644) Pub Date : 2020-04-01 Priyan Malarvizhi Kumar; Hari Mohan Pandey; Gautam Srivastava
-
Call for Special Issue Papers: Big Data in Business. Big Data (IF 3.644) Pub Date : 2020-02-01 Haitham Nobanee
-
Certification or Advanced Degrees. Big Data (IF 3.644) Pub Date : 2020-02-01 Dan Holle
The value of training for a data sciences professional is in the eye of the beholder, and it depends on the scope and breadth of that training and on its cost and time frame. Value for the employee may differ from value for the employer. The lens is different and value may depend on what lens you look through. Training can be online or on-site, short term with specific focus or longer
-
Stock Market Prediction Using Optimized Deep-ConvLSTM Model. Big Data (IF 3.644) Pub Date : 2020-02-01 Amit Kelotra; Prateek Pandey
Stock market prediction acts as a challenging area for the investors for obtaining the profits in the financial markets. A greater number of models used in stock market forecasting is not capable of providing an accurate prediction. This article proposes a stock market prediction system that effectively predicts the state of the stock market. The deep convolutional long short-term memory (Deep-ConvLSTM)
-
A Web Application for Interactive Visualization of European Basketball Data. Big Data (IF 3.644) Pub Date : 2020-02-01 Guillermo Vinué
The statistical analysis of basketball games is a fast-growing field. Certainly, basketball data are scientifically relevant because an appropriate analysis provides a great deal of information about the performance of both players and teams. The number of games played each season generates a large amount of data worth analyzing. Basketball analytics is well established in U.S. leagues. In Europe,
-
SOOM: Sort-Based Optimizer for Big Data Multi-Query. Big Data (IF 3.644) Pub Date : 2020-02-01 Radhya Sahal; Mohammed H Khafagy; Fatma A Omara
Mostly, sorting of data is a common operation in many applications, which causes the consumption of resources and thus leads to computation overheads. Regarding the context of Big Data multi-query, the shared sort operations are fairly large, which incur high-cost I/Os whether explicit or implicit. In particular, Big Data multi-query, including aggregation and sort operations, takes long execution
-
STDADS: An Efficient Slow Task Detection Algorithm for Deadline Schedulers. Big Data (IF 3.644) Pub Date : 2020-02-01 Utsav Upadhyay; Geeta Sikka
The MapReduce programming model was designed and developed for Google File System to efficiently process large-scale distributed data sets. The open source implementation of this Google project was called the Apache Hadoop. Hadoop architecture includes Hadoop MapReduce and Hadoop Distributed File System (HDFS). HDFS supports Hadoop in effectively managing data sets over the cluster and MapReduce programming
-
Using Behavioral Analytics to Predict Customer Invoice Payment. Big Data (IF 3.644) Pub Date : 2020-02-01 Mohsen Bahrami; Burcin Bozkaya; Selim Balcisoy
Experiences from various industries show that companies may have problems collecting customer invoice payments. Studies report that almost half of the small- and medium-sized enterprise and business-to-business invoices in the United States and United Kingdom are paid late. In this study, our aim is to understand customer behavior regarding invoice payments, and propose an analytical approach to learning
-
Mining the Thin Air—for Understanding of Urban Society Big Data (IF 3.644) Pub Date : 2019-12-01 Ron Bekkerman; Adi Zmirli; Scott Kirkpatrick
We explore the potential of crowd-sourced information on human mobility and activities in an urban population drawn from a significant fraction of smartphones in the Los Angeles basin during February-May 2015. The raw dataset was collected by WeFi, a smartphone app provider. The dataset is noisy, irregular, and lean; however, it is large scale (over a billion events), cheap to collect, and arguably
-
Deep Learning on Big, Sparse, Behavioral Data Big Data (IF 3.644) Pub Date : 2019-12-01 Sofie De Cnudde; Yanou Ramon; David Martens; Foster Provost
The outstanding performance of deep learning (DL) for computer vision and natural language processing has fueled increased interest in applying these algorithms more broadly in both research and practice. This study investigates the application of DL techniques to classification of large sparse behavioral data, which has become ubiquitous in the age of big data collection. We report on an extensive
-
Transforming Finance Into Vision: Concurrent Financial Time Series as Convolutional Nets Big Data (IF 3.644) Pub Date : 2019-12-01 Vasant Dhar; Chenshuo Sun; Puneet Batra
We present a novel representation for multiple synchronized financial time series as images, motivated by deep learning methods in machine vision. The research pursues two related strands of inquiry. The first is to transform concurrent synchronized time series analysis (one that is prevalent in Finance and other domains) into a machine vision problem so that the standard deep learning machinery such
-
An Experience-Centered Approach to Training Effective Data Scientists Big Data (IF 3.644) Pub Date : 2019-12-01 Kit T. Rodolfa; Adolfo De Unanue; Matt Gee; Rayid Ghani
Like medicine, psychology, or education, data science is fundamentally an applied discipline, with most students who receive advanced degrees in the field going on to work on practical problems. Unlike these disciplines, however, data science education remains heavily focused on theory and methods, and practical coursework typically revolves around cleaned or simplified data sets that have little analog
-
Effects of Distance Measure Choice on K-Nearest Neighbor Classifier Performance: A Review. Big Data (IF 3.644) Pub Date : 2019-12-01 Haneen Arafat Abu Alfeilat; Ahmad B A Hassanat; Omar Lasassmeh; Ahmad S Tarawneh; Mahmoud Bashir Alhasanat; Hamzeh S Eyal Salman; V B Surya Prasath
The K-nearest neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with the most complex classifiers in the literature. The core of this classifier depends mainly on measuring the distance or similarity between the tested example and the training examples. This raises a major question about which distance measures to be used for the KNN classifier
-
Interview with Dr. Silvio Carta, Author of the Book Big Data, Code and the Discrete City (Routledge 2019). Big Data (IF 3.644) Pub Date : 2019-11-01 Silvio Carta