1 Introduction

There is a digital revolution in businesses across sectors and geographies. There is a need for both traditional and new business models to capitalize on the digital technologies to overcome existential threats and gain access to game-changing opportunities. The digitalization revolution has gained momentum on the back of the Covid-19 pandemic which has forced businesses to reshape their business models. The availability of digital technologies and vastly spread Internet connectivity is leading to large-scale digital transformations, enabling organizations to redefine their business models (Van Tonder et al. 2020). Digital transformation involves two steps, namely, digitization and digitalization (Verhoef et al. 2021). Digitization is the process of transforming information from analogue to the digital mode. Digitalization refers to the process of leveraging digital technologies to redefine business models to benefit from digital business opportunities.

For successful implementation of digital transformation initiatives in shaping the next generation of products and services, organizations need a coherent and well-crafted digital strategy supported by technology and customer experience (Erceg and Zoranović 2020; Pappas et al. 2018; Sebastian et al. 2017; Zaki 2019). This strategy has to be enabled by digital capabilities (Westerman 2018). Any digital strategy encompasses the exploration, application, and large-scale adoption of technologies, such as Social, Mobile, Analytics, Cloud, and IoT (SMACIT), Virtual Reality, Blockchain, 3D Printing, Drones, and Augmented Reality. Adopting the right digital strategy will enable organizations in remaining competitive, overcoming the challenges of digitalization, and taking advantage of opportunities (Becker and Schmid 2020).

The digital strategy augments the organizational strategy by transitioning from predicting and planning to experimenting and responding. It also helps in agile planning, inclusive responsibilities, and utilization of IT capabilities. Digital solutions strategy often includes establishing ‘Data Science’ capabilities (Dremel et al. 2017). The rapid increase in digitization is opening up avenues for the large-scale capture, storage, and analysis of business data, which if optimally used with ‘Data Science’ tools and technologies can unlock significant business benefits. ‘Data Science’ plays a vital role in harnessing data from all organizational touchpoints and analyzing it to generate business insights across the domains of R&D, manufacturing, marketing, operations, supply chain, customer relationship management, strategy formulation, finance, human resource management, and others.

Driven by technological advancements, the increased ability to collect and process large amounts of data to extract business insights has led to a rush towards the implementation of ‘Data Science’ initiatives. It is expected that early adopters of Artificial Intelligence (AI) will share a global profit pool of $1 trillion by 2030 (Bughin 2018). Though the importance of ‘Data Science’ is widely acknowledged, most organizations are still in the early stages of adoption. About 75% to 80% of organizations (Deloitte’s Analytics Advantage Survey, 2013; Forbes report, 2015) have failed to successfully deploy ‘Data Science’ (Ghasemaghaei et al. 2018; Mazzei and Noble 2017).

In 2015, more than 75% of organizations either invested or planned to invest in ‘Data Science’ (Sun et al. 2020). However, more than 95% (2017 survey by McKinsey Global Institute and Digital@McKinsey) of the companies across domains and geography have not embraced AI to reinventing their business operations (Bughin 2018). As of 2017, while more than 80% of the organizations see an opportunity in adopting AI strategically, only 23% of them have adopted AI, and another 23% have started with pilot projects in AI (Ransbotham et al. 2017). By 2019, though AI adoption grew by 270% over the last four years (2015–2018), only 37% of the organizations have actually deployed AI or at least have short-term plans of embracing it (Rowell- Jones and Howard, 2019: 2019 Gartner CIO Survey). Amongst the ‘Data Science’ technology adopters, a large section has not achieved the desired Return on Investments (ROI) or repeated success from adoption (Braganza et al. 2017).

In the recent past, GE failed to implement a digital strategy planned towards digital transformation and as a result, had to lay off employees supporting its digital strategy (Colvin 2018). In similar lines, Nike could not reap benefits from customer digital data that was intended to provide feedback and suggestions on consumer lifestyle (de Swaan Arons et al. 2014). Nike had to discontinue its Nike + products (Correani et al. 2020). This implies that despite much hype (Fox and Do 2013) about ‘Data Science’ and its benefits, the successful implementation in organizations and in public sector (Vydra and Klievink 2019), and achieving the desired outcomes remain a challenge (Tabesh et al. 2019). We attribute this challenge to the lack of a well-crafted, holistic, distinct and robust 'Data Science Strategy' and/or failure in its implementation.

The evidence from reviewed articles suggests a lack of concise framework for understanding the enablers and barriers of ‘Data Science Strategy’ in enterprises. Therefore, the main purpose of this article is to synthesize all the enablers and barriers affecting the formulation and implementation of ‘Data Science Strategy’ in organizations across different industries, sectors, and geographies. To begin with, the structured literature survey is carried out to summarize the learnings from extant literature. The literary content is then segmented under appropriate themes. The study also proposes a conceptual framework for the successful implementation of ‘Data Science’ projects which would also be of use to future researchers with significant agenda for ‘Data Science Strategy’ research.

The learnings from this “Systematic Literature Review (SLR)” can be used by both academic researchers and practitioners in the field of ‘Data Science’ and ‘Information System’ Strategy. Academic researchers can explore the gap in existing literature in the area of ‘Data Science Strategy’ and attempt empirical studies on areas where the gaps have been identified. Practitioners such as ‘Data Science’ management, ‘Chief Data Science Officers’ can use our findings to understand the challenges, issues, best practices, and the possibilities of implementing ‘Data Science Strategy’ in their organizations.

Specifically, this paper focuses on the following research questions with respect to Data Science: What are the enablers and barriers that affect the strategic implementation and success of ‘Data Science’ in organizations?

The remaining sections of the paper are organized as follows: The background section presents the contextual information on ‘Data Science’ and its constituents, ‘Data Science Strategy’, and Contexts and associated factors. The research method section then discusses the methodology used in our literature review. This is followed by the results section, which presents the synthesized findings and conceptual framework. The discussion section details the various organizational approaches to ‘Data Science’ adoption, theoretical contributions and managerial implications. The limitations and future research section details the limitations of the study and the avenues for future research. Finally, the conclusion section summarizes the findings of the study.

2 Background

The definition of ‘Data Science’ is constantly evolving with the continuous advancement in technological trends. ‘Data Science’ is a cyclical process of capturing the business needs and acquiring the relevant data, storage, security, privacy, preparation, pre-processing, analytics; generating and communicating insights; and finally, actuating these actionable insights (Jägare 2019). ‘Data Science’ is an umbrella term that cuts across constituents such as Artificial Intelligence (AI), Machine Learning (ML), Big Data (BD), Big Data Analytics (BDA), Visualization, and Business Intelligence & Analytics (BI&A), Mathematics, Computer Science & Programming Skills, and Domain Knowledge (Fig. 1).

Fig. 1
figure 1

‘Data Science’ constituents. The size and shape of each circle (or oval) in Fig. 1 is only representative, drawn in convenience to adjust the text and not indicative of any weightage to the ‘Data Science Constituents.’

‘Data Science Strategy (DSS)’ refers to the overall organizational strategy signifying its ‘Data Science' investment. It includes the overall Data Science objectives, strategic choices, regulatory requirements, data strategy (data management including, acquisition, storage, security, privacy, ethics, and data governance), resource management, competency build-up, infrastructure planning. This strategy also defines the means to track Key Performance Indicators (KPI) and Return on Investments (ROI) (Jägare 2019).

The adoption and successful implementation factors of ‘Data Science Strategy’ vary between industries and, more importantly, on the aligned business objectives. Currently, the Covid-19 pandemic is testing the resilience of organizations by challenging organizational design and work practices. In giant organizations (viz. Amazon, Google, AirBnB), Small and Medium Enterprises (SMEs), and start-ups, the key focus is on refining the current organizational strategy to integrate digital-enabled exponential systems (George et al. 2020). In context of the growing debate on automation replacing jobs, Soto-Acosta (2020) note that only 20% of activities are automated and the rest are augmented by digitalization. To make automation possible, more jobs need to be created in the area of ‘Data Science’ in order to perform the majority of tasks remotely. The Covid-19 pandemic is acting as a catalyst for the digital transformation of established firms as well as startups, forcing them to innovate in order to survive. Even traditional companies with no or limited digital experiences are adopting digital technologies in order to stay relevant in the market. This acceleration in the adoption of digital technologies finds reflection in the revised way of techniques for data collection, online appointments and therapies, work-from-home, smart-homes, schooling, learning and social interconnect activities (Hantrais et al. 2020; Maalsen and Dowling 2020).

Extant literature has studied ‘Data Science Strategy’ in the contexts of dynamic market places, organizational and dynamic capabilities (Knabke & Olbrich, 2018), innovations (Mikalef et al. 2018, 2019a), and new product development processes (Johnson et al. 2017). Studies have been conducted in the context of but not limited to healthcare (Chen and Banerjee 2020; Kamble et al. 2019; Kemppainen et al. 2019; Li et al. 2021; Newlands et al. 2020; Ramnath et al. 2020; Yang et al. 2015; Wang and Hajli 2017), B2B (Hallikainen et al. 2020), construction (Ahmed et al. 2018; Ram et al. 2019; Sang et al. 2020), supply chain management (Ali et al. 2020; Arunachalam et al. 2018; Brinch et al. 2018; Dubey et al. 2019a; Khan 2019; Lai et al. 2018; Lamba and Singh 2018; Mandal 2019; Singh and Singh 2019; Wang et al. 2018c), manufacturing (Popovič et al. 2018; Verma 2017), consumer goods (Rialti et al. 2018); e-commerce (Behl et al. 2019; Wamba et al. 2017), telecommunications (Saldžiūnas and Skyrius 2017; Walker and Brown 2019), banking and financial services (Lee et al. 2017; Lautenbach et al. 2017; Gregory 2011), automotive (Dremel et al. 2017), and airlines (Holland et al. 2020).

Attempts have also been made in the past to conduct SLR to identify various components of Data Science from a strategy point of view. These review articles from the strategic adoption point have focused highly on individual components of ‘Data Science’. In the recent past ‘Big Data’ (Ciampi et al. 2020; Günther et al. 2017; Mikalef et al. 2016, 2020a; Nelson and Olovsson 2016; Olszak and Mach-Król 2018; Wamba et al. 2015; Zhao-hong et al. 2018), ‘Big Data and dynamic capabilities’ (Rialti et al. 2019), ‘Big Data business models’ (Huberty 2015; Wiener et al. 2020), and ‘Big Data assessment models’ (Adrian et al. 2016) have been explored. Researchers have also focused upon ‘Big Data Analytics’ (Adrian et al. 2017; Al-Sai et al. 2020; Bogdan and Lungescu 2018; Inamdar et al. 2020; Maroufkhani et al. 2019; Mikalef et al. 2019a; Singh et al. 2021; Sivarajah et al. 2017), Artificial Intelligence (Alsheiabni et al. 2020; Borges et al. 2021; Keding 2020; Kitsios and Kamariotou 2021; Markus 2017), Business Analytics (Cao and Duan 2017), and Business Intelligence and Analytics (BI&A) (Chen et al. 2012; Eggert and Alberts 2020; Lautenbach et al. 2017; Llave 2017; Moreno et al. 2019; Sang et al. 2020).

The literature on ‘Data Science Strategy’ under single-unified umbrella term that cuts across different technologies, such as ‘Big Data’, ‘Data Analytics’, ‘Artificial Intelligence’, ‘Machine Learning’, ‘Deep Learning’, ‘Visualization’, and ‘Business Intelligence’ is considerably scarce. In contrast, studies on ‘Data Science’ adoption and implementation in enterprises with corresponding barriers and enablers across various industry sectors and domains are limited. The conceptual framework for holistic ‘Data Science Strategy’ is absent to our limited understanding. To be successful in deriving desired business outcomes organizations need to recognize the need for a well-crafted and all-encompassing ‘Data Science Strategy’ augmenting digitalization and digital transformation.

3 Research method: a systematic literature review

The SLR is focused on identifying and synthesizing the knowledge on enablers and barriers influencing ‘Data Science Strategy’ in organizations. It takes into consideration the accessible scholarly business management research articles in the English language from four databases, namely, EBSCO business source ultimate, Scopus, Science Direct, and Pro-Quest databases. These databases are chosen since they have the most relevant and reputed peer-reviewed list of journals (by renowned publications like Springer, Emerald, IEEE, Elsevier, Taylor and Francis, etc.) where ‘Data Science Strategy’ research is traditionally published. Also, the large number of past SLRs (Mikalef et al., 2018; Dam et al., 2019; Sivarajah et al., 2017; Eggert and Alberts, 2020; Arunachalam et al., 2018) focused on the AI, ML, BDA domains too have considered these scholarly databases. The articles were identified based on the keyword search in title, abstract, and article keywords sections. In total, the search yielded 573 documents. The duplicates were excluded leaving a total of 480 articles for consideration. In the next step, the number of relevant articles was further narrowed down based on examination of title and abstracts and this brought down the number to 158 articles. The examination was focused towards characterizing factors influencing the ‘Data Science Strategy’ implementation in organizations. The systematic process (Fig. 2), based on further in-depth analyses for relevance and quality of articles, resulted in a final literature base comprising of 113 articles, leading to analyses of findings, synthesis, conceptual framework and research agenda for future studies.

Fig. 2
figure 2

Methodology and output

4 Results

4.1 Findings

This article summarizes the research in the field during the period 1998–2021.However, there have been more articles published on the subject during 2015–2021., Driven by the advancements in ‘Information Technology Capabilities’ and reduction in cost of computations, organizations of all sizes and types in the last quarter of twentieth century began to adopt digital technology to increase their competitiveness (Khosrowpour, 1990). Their focus was to implement business applications evolving from technological advancements in data associated fields. The ‘Big Data’ and ‘Data Science’ terms were coined in the year 2007 and 2008, respectively, and ‘Big Data Analytics’ gained importance in the year 2011 (Nguyen et al. 2018). As a result, both academic and practitioners shifted attention to the adoption and implementation of ‘Data Science’ strategies.

After the systematic process, as described in the research methodology section, this paper included 79 distinct journals in this literature review. Out of the 113 articles selected for the review, 17 were from A*-category (according to the 2019 ABDC journal quality list), followed by 37 in A category, 15 in B category, and 17 in C category journals. Likewise, 16 articles were indexed in the Scopus database. Further, 11 articles belonged neither to the ABDC quality list nor the Scopus database (Fig. 3).

Fig. 3
figure 3

Year-wise distribution of articles with publication ranking

Figure 3 also represents the distribution of articles between the years 1998 and 2021 (till the date of search). The number of articles on ‘Data Science Strategy’ have increased over the years. The year 2019 recorded a maximum number of 31 papers. The steep increase in the number of articles over the past five years is testimony to the importance of the subject and the interest it is generating amongst academicians and practitioners worldwide. With more and more businesses investing in data-driven technologies, clear-cut business objectives are still far from realization. Hence, despite the increased number of articles, this research domain is still an emerging one.

Figure 4 shows the distribution of articles based on the industry type with corresponding research methods. The maximum number of articles, that is 48 articles, deal with multiple industries. Of these, 28 are empirical studies. Individually, the ‘Healthcare’ industry dominates the publications with 09 articles, out of which four are empirical studies, four are case study articles, and one SLR. ‘Financial Services’ and ‘Construction’ industries have received attention in 06 and 05 articles, respectively.

Fig. 4
figure 4

Industry-wise distribution of articles with research methods. *Multi-Industry in the above plot refers to data collected from respondents (by means of interviews, case studies, and empirical investigations) belonging to multiple industries such as Technology & Entertainment, Web Service, Healthcare, Insurance, Manufacturing, Retail, Telecommunications, Agriculture, Banking, Transportation, Oil & Gas, Media, Consumer Goods, and many more

Figure 5 showcases the distribution of articles based on the domain of work with corresponding research methods. The supply chain management domain leads with 15 articles, of which eight articles are empirical studies. Manufacturing and policy studies are credited with 4 articles each Other domains with corresponding articles are also listed.

Fig. 5
figure 5

Domain-wise distribution of articles with research methods

The distinct ‘Enablers and Barriers’ of ‘Data Science Strategy’ for different industry sectors, and domains are segregated in supplementary Table S1. The applicability and intensity of each barrier towards ‘Data Science’ adoption and its success varies based on the industry sector and the domain areas. Table S1 provides a comprehensive list of barriers in each of the major industries and domains varying from airlines to construction, healthcare to financial services, manufacturing to e-commerce, and so on. The underpinning theories used in the reviewed literature along with the key enablers of ‘Data Science Strategy’ implementation are documented in supplementary Table S2. Reviewed literature has drawn upon many theories including Resource-Based View (RBV), Dynamic Capability View (DCV), Knowledge-Based View (KBV), Technology-Organization- Environment (TOE) framework, Organizational Learning Theory (OLT), Diffusion of Innovation Theory (DIT), Technology Acceptance Model (TAM), Task-Technology Fit (TTF), Information Systems Success Model (ISSM), Agency Theory (AT), Stakeholders Theory (ST), Institutional Theory (ITh), and many more.

Different research methods adopted by the reviewed articles are graphically summarized in supplementary Fig. S1. The findings imply that nine different research methods are used in general. The majority of studies involved ‘Analytical or Empirical’ studies followed by ‘Case Study-based’ methodologies and ‘Interviewing’. The other research methods include Theoretical, Conceptual, Focus Group Discussions, Delphi Studies and Expert Viewpoints. The article search on the subject area resulted in articles from 79 distinct reputed journal publications. The journal publications with more than one article are documented in supplementary Fig. S2.’British Journal of Management ‘ has a maximum six number of articles published in the research area followed by five articles each in’Engineering, Construction and Architectural Management’, and’Industrial Marketing Management Journals ‘. Four articles are published in’Information and Management Journal ‘ followed by three articles each in’Information Systems & e-Business Management ‘,’Information Systems Frontiers ‘, and’International Journal of Logistics Management.’ The underpinning theories (49 numbers) for the reviewed articles on ‘Data Science Strategy’ is highlighted in supplementary Fig. S3. Resource-Based View is extensively used by 26% of the published research articles. Dynamic Capability Theory (19%) is the next prominent theory followed by Technology-Organization-Environment, which finds a place in 7% of the articles. Around 39% of the articles are drawn upon by different other theories as highlighted in supplementary Table S2. The study region of reviewed articles is spread across different geographies, as documented in supplementary Fig. S4. The United States of America dominates the list with 15 articles followed by 8 studies based on Indian industries. Europe and China follow the list with 7 articles each studied in the respective regions. Other regions including Asia, Singapore, Canada, Slovenia, Saudi Arabia, Pakistan, Norway, Iran, Finland, Dutch, and Brazil contribute one article each.

4.2 Synthesis of ‘data science’ components

The reviewed articles categorize the ‘Data Science’ components based on underlying theoretical frameworks. Under the lens of RBV, they are classified under Tangible, Human, and Intangible resources (Gupta and George 2016; Mikalef and Gupta 2021). Tangible resources comprise of data (internal, external, and combined data), technology (Hadoop, NoSQL, etc.), and basic resources, such as time and investments. The technical (pertaining to data-specific) and managerial (analytical and business acumen) skills include human resources. The intangible resources indicate organization culture and learning abilities, which include data-driven approaches and Knowledge-Management Systems. Few studies (Sun et al. 2020; Verma and Bhattacharyya 2017) have adopted the TOE framework to categorize the ‘Data Science’ components under the context of technology (resources, competence, complexity, compatibility, and relative advantage), organization (Firm size, perceived costs, and organizational support), and environment (competition, partner readiness, industry type, and regulations). Few more studies are carried out under the Motivation-Opportunity-Ability (MOA) theoretical framework (Wang et al. 2018c). Motivation comprises of perceived ease and usefulness, external pressure, corporate culture and leadership support. Regulatory mechanisms, policies, information level, and associated risk define the opportunity aspect. The ability aspect points toward management capability, data talent, and infrastructure. Considering the merits of all these theoretical foundations and corresponding literature, in this SLR, the factors are synthesized (Fig. 6) into the following themes: 'Content' referring to Data Characteristics and Data Governance; 'Context' in technology and environmental aspects; 'Intent' toward aligning with the core strategy of the organization, managerial willingness, organizational agility, leadership, cultural aspects; and 'Outcome' as a business value in terms of 'Key Performance Indicators' (KPIs) and 'Return on Investments’ (RoI).

Fig. 6
figure 6

Synthesized factors associated with ‘Data Science Strategy’ in organizations

4.2.1 Data characteristics

In the successful implementation of ‘Data Science’ in an organization, data characteristics can add value to the firm. Transforming these data characteristics into a valuable proposition relies upon the firms’ understanding of these characteristics and how to deliver value through their use (Wright et al. 2019). Ranjan (2019) analyzed the problems presented by data characteristics under the context of 10 Vs—Volume, Variety, Velocity, Veracity, Variability, Validity, Visualization, Vulnerability, Volatility, and Value. Whereas Ghasemaghaei et al. (2018) and Wamba et al. (2015) characterized data using 7Vs, namely Volume, Velocity, Variety, Veracity, Value, Variability, and Visualization. Each of these characteristics is important under different contexts including the industry sector. It is necessary to keep data as driver. Keeping data as driver demands certain business processes to be able to effectively use the data (de Medeiros et al. 2020). These business processes and procedures used in collecting and analyzing ‘data’ take the implementation of ‘Data Science Strategy’ further ahead (Rialti et al. 2019).

The characteristic of an incidence reflecting the raw facts defines the data quality. The data quality comprises of correctness, completeness, relevancy, timeliness, clarity, consistency, ease of understanding, and accessibility. It greatly affects the results of ‘Data Science’ (Ghasemaghaei et al. 2018). The application of ‘Data Science’ would positively influence the business value, once data quality challenges are addressed (Wamba et al. 2019). Intention to adopt ‘Data Science’ is positively affected by data quality and an understanding of the benefits of data.

4.2.2 Governance

For warranting the data quality and leveraging its value, data governance is a potential approach (Gregory 2011; Grover et al. 2018; Mir et al. 2020; Braganza et al. 2017; Wiener et al. 2020). Governance includes areas such as organizational strategy, data quality & security, innovation applications, ‘Data Science’ architecture, and lifecycle (Fakhri et al. 2020). Data policies are deployed by organizations to support product innovation (Lacam 2020).

The data governance aims at adding value to the enterprise through risk mitigation plans and helps achieving compliance (Gregory 2011). Bertot et al. (2014) highlight the need to develop a data governance model with respect to ‘Data Science’ in the context of privacy, reuse, accuracy, archival, curation, platforms, architecture, and sharing policies across sectors. In the finance sector, in protecting the consumer rights and interests, it is conducive to regulate the use of personal information (Li & Yu 2020). The magnitude of influence of data security, privacy, and accuracy are in descending order on ‘Data Science’ adoption and are 54%, 36%, and 8%, respectively in the total measurement (Latif et al. 2019). Effective data exchange within and among network partners is a necessity to benefit from ‘Data Science’. Due to security and privacy challenges, the organizations can be hesitant in sharing data with partners (Günther et al. 2017). To establish close connections and harmonization with partners there needs to be balanced and controlled sharing of data (Coombs et al. 2020; Sivarajah et al. 2017). There exists a positive interplay between governance of BDA infrastructure and BDA capabilities (Bertello et al. 2020).

4.2.3 Technology

Over the last few years, immense progress is observed in the technology related to ‘Data Science’ (Gupta and George 2016). The ‘Data Science’ concept emphasizes effective deployment of technology and talent to capture and manage data in order to generate insights (Mikalef et al. 2020b). ‘Data Science’ technology capability includes infrastructure and human talent.

The core themes of technology capability in terms of infrastructure are connectivity, compatibility, and modularity (Akter et al. 2016). In import and export business enterprises, with the introduction of a new IT system, it is necessary to build an internal management system to improve operational efficiencies (Zhang 2021). Investing in infrastructure, such as data lake, analytics portfolio, and human talent generates business value (Grover et al. 2018). Bendre and Thool (2016) have highlighted the use of ‘Data Science’ in different domains and emphasized the need for technological platforms covering lifecycle management including data acquisition, processing and visualization challenges. The challenges arising from data characteristics in computing methods require different strategies and advanced techniques including technology usage on data segregation and accumulation, high-performance computing, incremental learning, scalability, and heuristics (Choi et al. 2018; Wang et al. 2018b).

Amongst large and diverse firms, it is observed that investing in ‘Data Science’ which is IT-intensive and with a highly competitive nature leads to improvement in productivity (Müller et al. 2018). Many organizations have transitioned from traditional analytics (Analytics 1.0—business intelligence) to ‘Big Data Analytics’ (Analytics 2.0). Using the ‘Data Science’ platform that can combine the traditional analytics and ‘Big Data’ organizations are transitioning to a new synthesis, known as the Analytics 3.0- data enriched offering (Davenport 2013; Harlow 2018).

Human talent is arguably the most important and critical element in leveraging ‘Data Science’ investments (Grover et al. 2018; Wamba et al. 2015). Talent capability includes technical, relational, and business knowledge, along with the ability to manage technology (Akter et al. 2016). The challenges in talent management include cognitive and behavioural limitations, and lack of educational programs for the development of both analytical and communication skills with regards to translating data into business insights (de Medeiros et al. 2020).

The implementation of a ‘Data Science’ value chain is driven by overall business directions and strategy, including the knowledge management strategy. In this context, ‘knowledge management’ refers to the process of sharing internally held information in an easy and systematic approach for the benefit of the organization. Developing a strategic intent and knowledge management system for ‘Data Science’ will lead to a long-term sustainable competitive advantage by allowing enterprises to implement new business models and develop large-scale ability to experiment (Harlow 2018). Relative performance in firms that develop more ‘Data Science’ capabilities than others is higher. Further, knowledge management orientation plays a significant role in amplifying the effect of ‘Data Science’ capabilities (Ferraris et al. 2019). Knowledge management capabilities and organizational ambidexterity (exploration and exploitation) influence the relationship between ‘Data Science’ and strategic flexibility (Rialti et al. 2020). The synergy between organizational resources and capabilities in complementing ‘Data Science’, improves organizational objectives and outcomes (Wang et al. 2019). Poeppelbuss et al. (2011) studied maturity models describing organizational capabilities from the research, publication and practitioner’s perspectives. Grossman (2018) introduced the Analytic Processes Maturity Model (APMM) framework to evaluate the analytic maturity of an organization.

4.2.4 Business environment

The business environment reflects industry characteristics and government regulations. The industry is characterized by the firms' partners and competitors, and the government policies influence the macroeconomics context and the regulations (Sun et al. 2020). A favourable industry and regulatory environment (infrastructure, legal, regulations, directives, support, competitive pressure, and trading partner readiness) positively affects the ‘Data Science’ adoption intent (Sun et al. 2020; Wiener et al. 2020). Dubey et al. (2019b) developed and tested the model describing the relation between resources, talent, culture, cost, and operational performance with institutional pressures (coercive, normative, and mimetic).

To successfully adopt ‘Data Science’ in the organizational context, firms should take the internal teams onboard the digital transformation journey, while reinforcing strategic initiatives and influencing employee behaviour towards their successful implementation (Boldosova 2019). In addition to its technical capacity, the power dynamics within a firm and its competitive landscape play a significant role in ‘Data Science’ build-up (Jha et al. 2020). An organization’s relationships with its stakeholders, including the public sector, and assimilation of social media facilitate organizational learning (Okwechime et al. 2018). This further leads to the absorption of ‘Data Science’ technologies (Bharati and Chaudhury 2019). Investment in ‘Data Science’ not only affects firms’ equilibrium price, market share, and profit but also the rivals’ performances (Wu et al. 2017). With ‘Data Science’ investments, market outcomes also vary based on the competition strategy used, either conservative or expansive. Depending on a fully covered or partially covered market structure, the impact on firms’ competition varies when consumer’s preference of ‘Data Science Strategy’ is heterogeneous.

There are limited studies exploring the factors for adoption of ‘Data Science’ in the start-up environment. These studies bring to light the fact that the key factors affecting adoption are different from established firms (Behl et al. 2019). Different enablers of ‘Data Science’ in start-ups are technical support from the vendors, attitude of the top management, competitive environment, infrastructure, skill enhancement, data access, quality of data, and perceived usefulness. Higher performance can be derived from firms when the Strategic Factor Market (SFM) is imperfect due to the differences in the expectation of the future value of strategic resources. The Information Management Capability (IMC) allows the firm to manage market needs and directions in line with core strategy (Macada et al. 2019).

4.2.5 Alignment with core strategy

‘Digital strategy’ being the superset of ‘Data Science Strategy’ needs to be aligned with the core organizational strategy. Most ‘digital strategies’ fail because they fail to re-imagine the vision of the organization and the journey of digitization. Alignment of digital strategy with core business strategy needs to be accompanied by the way the organization drives its vision (Trompenaars and Woolliams 2016). The digital strategy that an organization should embrace needs to be unique and be difficult enough for the competitors to replicate (Ross et al. 2017). Managerial and operational capabilities are necessary for realizing a digital business strategy (Ukko et al. 2019).

Adopting ‘Data Science Strategy’ in an organization does not mandatorily require building a new strategy. Rather, it requires a coherent alignment with the planned business objectives (Keding 2020). While formulating long-term business strategy, the inclusion of ‘Data Science Strategy’ facilitates business alignment and eventually leads to success (Grover et al. 2018).

Successful deployment of ‘Data Science Strategy’ requires good collaboration and alignment between the IT and business departments. This collaboration and alignment require diverse mechanisms of organizational governance (Dremel et al. 2017). Several stakeholders within and outside the organization mutually benefit either by combining and/or sharing data (Günther et al. 2017). Hence, it is important to align the ‘Data Science Strategy’ of an organization with that of core strategy (Biazzin and Castro-Carvalho 2019).

4.2.6 Managerial willingness

Behavioural intention to use ‘Data Science’ in organizations is determined by outcome expectations, social perception, enabling conditions, and resistance to change. Though ‘Data Science’ is perceived to be difficult (effort expectation), the influence of this perception on ‘Data Science’ adoption intention is small. The ‘Data Science’ adoption intention is contained to the relationship between facilitation conditions on behavioural intention (Cabrera-Sánchez and Villarejo-Ramos 2019). Based on the understanding of ‘Data Science’, the record of its success, and the logical reasoning provided by the technology itself, the managers' willingness to trust ‘Data Science’ varies (Keding 2020). Instead of working in business units in silos, leveraging ‘Data Science’ as a horizontal facilitator leads to an increased awareness at managerial levels. There is a different level of acceptance of ‘Data Science’ based on hierarchical layers in an organization. By customized design of algorithms, the acceptance level can be increased.

Leadership team support is a necessary ingredient in adopting ‘Data Science’. It provides the needed resources by actively promoting, endorsing, and fostering its use. Also, this helps manage change and remove organizational barriers related to Data Science usage (Lautenbach et al. 2017). An organization’s ‘Data Science’ adoption decision can be greatly influenced by highly motivated top managers inclined towards innovations. This happens because such personnel can only provide strategic direction, authority, and resources (Sun et al. 2020).

4.2.7 Organizational agility

Agility positively mediates the relationship between ‘Data Science’ and firm performance (Rialti et al. 2019). Organizational agility must be developed in order to contribute to the emergence of an overall ‘Data Science’ capability (Mikalef and Gupta 2021). Agility is often connected to an organization's dynamic capabilities (Rialti et al. 2019). For all the business processes including product development, resource allocation, knowledge creation, and/or ‘Data Science’ cycle, there is an association of organizational dynamic capabilities (Božič and Dimovski, 2020; Braganza et al. 2017). In a highly dynamic business environment, adopting ‘Data Science’ creates competitive advantage (Sun et al. 2020).

Work practices need to be realigned continuously to gain from ‘Data Science’ (Günther et al. 2017). At the work-practice level, organizations work with ‘Data Science’ in inductive (bottom-up) and deductive (top-down) approaches. For ‘Data Science’ success, both these approaches need to intertwine and complement each other. The intensity of organizational learning is one of the critical intangible resources needed to build ‘Data Science’ capability (Gupta and George 2016; Mikalef et al. 2020b).

4.2.8 Leadership and culture

Cultural differences are a prime aspect in firms' managers' willingness for and resistance to ‘Data Science’ adoption (Keding 2020). Leadership with a clear ‘Data Science Strategy’ and vision can articulate the business cases that are likely to succeed (Grover et al. 2018).

In capitalizing on ‘Data Science’ capabilities, firms need power shift in the organization structure. The new teams of data and analytics need support from top management teams and also an inclusive data-based decision process (GalbRaith 2014). Behavioural change too is required in order to make data-driven decision and stimulating organizational readiness for capability restructuring is the need of the organizations (de Medeiros et al. 2020).

Along with increasing proficiency in sustainable design, 'Data Science' capability also directly enhances sustainable growth and performance (Zhang et al. 2020). ‘Data Science’ increases sustainable innovativeness in organizations as a measure of successful new product development (Song et al. 2020). For sustainable development, assessing the readiness of organizations to adopt 'Data Science' on the temporal dimension is important (Olszak and Mach-Król 2018).

As an antecedent of agility, ambidexterity improves organizations’ ability to respond effectively to market changes (Rialti et al. 2019). Ambidextrous organizational culture acts as a mediator between ‘Data Science’ capabilities and firm performance. Exploration orientation of a firm has a positive effect on ‘Data Science’ leading to long-term results such as generation of new product revenues (Johnson et al. 2017). Organizations can choose strategic pathways based on an assessment of their technical (exploiting suppliers) and analytical (exploring data) capabilities (Najjar and Kettinger 2013). ‘Data Science’ capability has a positive and significant effect on innovative capabilities, both incremental and radical. Moderation effect by information governance is significant on radical innovation than on incremental innovation (Mikalef et al. 2020a). Increased exploration and exploitation capabilities of organizations may be considered related to implementing a 'Data Science capable’ business process management system within ambidextrous organizations (Rialti et al. 2018). Firms can realize organizational creativity and performance by fostering ‘Data Science’ (Mikalef and Gupta 2021). Organizational ambidexterity, which is data-driven with real-time responsiveness, increases firms’ ability toward new market opportunities.

4.2.9 Business value

‘Data Science’ has the potential to provide companies with high business value in terms of financial/market performance or customer satisfaction (Raguseo and Vitari 2018). The outcome of the 'Data Science' project lies in realizing business value, which could be transactional, strategic, and/or transformational (Ji-fan Ren et al. 2017). In enhancing business value, system quality and information quality are key factors (Ji-fan Ren et al. 2017). The strategic value of ‘Data Science’ could be functional (tangible: financial performance, market performance) and/or symbolic (intangible: reputation, brand image) (Grover et al. 2018). Based on industry characteristics and business goals, organizations need to have measures of Key Performance Indicators (KPIs) and RoIs.

The KPIs typically would involve data governance, decision quality, process efficiency, innovation contributions, business expansion, and stakeholder satisfaction (Chakravorty 2020; Grover et al. 2018). It is equally important to track the intermediate indicators, such as stakeholder sentiment and engagement, along with co-creation and value sharing (Libert et al. 2016). RoI refers to revenue, profitability, return on assets, share value, reduced costs, design-cycle optimization, symbolic value (image, reputation, first-mover), forecasting, and prediction (Grover et al. 2018; Shim et al. 2015). Early adopters of’Data Science', focused on the execution of projects as they knew that they had value to offer (Shim et al. 2015). With the integration of 'Data Science' into the mainstream, organizations need to devise the mechanism to substantiate the investments. To derive competitive advantage and add business value to organizations, ‘Data Science’ needs to be an integral part of the entire digital transformation journey and each organization needs a distinct ‘Data Science Strategy’ (DSS).

4.3 Conceptual framework of ‘Enablers and Barriers’ of successful ‘Data Science Strategy’

The SLR summarizes the status of research in the field of ‘Data Science Strategy’. Considering the fact that there is little effort so far in providing an eagle’s eye view on research in ‘Data Science’ encompassing the domains of ‘Big Data’, ‘Artificial Intelligence’, ‘BI&A’, and relevant technologies, this effort is a considerable leap in synthesizing all the enablers and barriers of ‘Data Science Strategy’ adoption in organizations. We propose a conceptual framework for developing and implementing successful ‘Data Science Strategy’ for enterprises (Refer Fig. 7).

Fig. 7
figure 7

Conceptual framework for enterprise ‘Data Science Strategy’

The synthesized ‘Data Science’ components under the themes 'Content', 'Context', 'Intent' and 'Outcome' led to the development of a conceptual framework influenced by the 4I model by Crossan et al. (1999). The 4Is: Intuition, Interpretation, Integration and Institutionalization, take place across individual and organizational levels. Okwechime et al. (2018) have made use of the 4I model in deploying and integrating ‘Big Data’ and ‘Smart Cities’ from the organizational learning perspective. Mandlik and Kadirov (2018) studied the ‘Big Data’ ecosystem at the micro-level (Individual behavior), meso-level (Network of firms), and the macro-level (Institutional, Socio-political). The conceptual framework proposed for ‘Data Science’ success, connects the construct ‘Resources’ as the business input leading to ‘Strategic Business Value’ as the outcome. This transformation of input to outcome is driven by the enterprise ‘Data Science Strategy’. The strategy rests on the balancing of two competing constituents of the same continuum, which are ‘Barriers’ and ‘Enablers’ of ‘Data Science’ success. Drawn upon the Stimulus-Organism-Response (S-O-R) model (Mehrabian and Russell 1974), Contingency model (Fiedler 1964), and Institutional theory (Meyer and Rowan 1977), the constituents ‘Barriers’ and ‘Enablers’ are grouped under the themes Individual, Organizational and Institutional. The S-O-R model describes the influence of factors affecting individual, organizational and institutional behaviors that translate inputs in to business outcomes. The contingency model is appropriate in defining the optimal approach to balance the barriers and enablers of ‘Data Science Strategy’. The reference to the institutional theory is appropriate in emphasizing the role of institutional setup beyond the limited individual and organizational functional considerations.

At the individual-level, the tendency of resistance to ‘Data Science’ project execution either by an employee or by a mid-level manager is either due to fear of ‘failure’ or ‘loss of control’ or ‘operational disruption’ (Mikalef et al. 2020c; Shahbaz et al. 2019). The enablers in the form of developing dynamic capability, such as experience in ‘dealing with complexity’, ‘high tolerance for complexity’ (Gong and Janssen 2021; Walker and Brown 2019) and ‘Top-management-Team’ support (Alaskar et al. 2020; Behl et al. 2019; Chaurasia and Verma 2020; Foshay et al. 2015; Halaweh and Massry 2015; Lai et al. 2018; Lamba and Singh 2018; Lautenbach et al. 2017; Popovič et al. 2018; Ransbotham et al. 2017; Verma and Bhattacharyya 2017; Walker and Brown 2019; Wang et al. 2018c) are a must to address the barriers to considerable extent. Organizational environment for an individual in communicating the benefits of ‘Data Science’ (Chakravorty 2020; Gong and Janssen 2021; Verma 2017) is also a barrier for ‘Data Science’ project success. Creating opportunities to interact with leadership team and adopting a deliberate storytelling technique (Boldosova 2019) would be helpful in overcoming the communication gap barriers. There is a significant effect of communicating to internal teams with deliberate storytelling, and reinforcing the strategic initiatives. This in turn influences employee behaviour towards ‘Data Science’ initiatives.

At organization-level barriers for ‘Data Science’ success range from data-related challenges to huge investment requirements to internal politics. Establishing ‘Data Science’ projects require huge investments on skills and infrastructure (Behl et al. 2019; De Luca et al. 2020; Holland et al. 2020; Lee et al. 2017; Wu et al. 2017), which is quite a big challenge for most organizations. Though it may not be avoided completely, infrastructure flexibility in identifying and using of compatible and complementary resources (Alaskar et al. 2020; Chaurasia and Verma 2020; Mikalef and Gupta 2021; Moreno et al. 2019; Shokouhyar et al. 2020; Verma and Bhattacharyya 2017; Walker and Brown 2019) already existing in the organization can considerably reduce the burden on new investments. Lack of skills and knowledge (Ahmed et al. 2018; Behl et al. 2019; Dubey et al. 2019b; Lamba and Singh 2018; Foshay et al. 2015; GalbRaith 2014; Mikalef et al. 2020a, 2019b, 2020c; Rialti et al. 2019) required to execute the ‘Data Science’ projects can be addressed by setting up ‘Training & Knowledge Management’ capabilities and processes (Calvard 2016; Dam et al. 2019; Ferraris et al. 2019; Harlow 2018; Rialti et al. 2020). Acquiring and retaining the right talent in the field of ‘Data Science’ is another challenge (Holland et al. 2020; Ransbotham et al. 2017) and needs to be addressed by rewarding and recognizing (GalbRaith 2014) the contributions in regular and timely manner.

The divergent actions towards ‘Data Science’ projects by different business units, including IT, in the same organization due to working in silos, and non-alignment (Calvard 2016; Foshay et al. 2015; Walker and Brown 2019) to business objectives are barriers for success. “Data Science Strategy’ must be aligned (Biazzin and Castro-Carvalho 2019; Comuzzi and Patel 2016; GalbRaith 2014; Ross et al. 2017; Mithas et al. 2013; Moreno et al. 2019; Trompenaars and Woolliams 2016; Ukko et al. 2019; Zaki, 2019) with ‘Digital Business Strategy’ which, in turn, should be aligned with ‘Organizational Strategy’. Further, there has to be seamless collaboration across the organization (Kache and Seuring 2017) to enable desired outcomes. Organizational inertness (Biazzin and Castro-Carvalho 2019), resistance to organizational flexibility (Ahmed et al. 2018; Cabrera-Sánchez and Villarejo-Ramos 2019; Dubey et al. 2019a; Mikalef et al. 2019a, 2020c; Wang et al. 2018b) and denial (no support) to experimentation (Alaskar et al. 2020; Mikalef et al. 2020c; Walker and Brown 2019) are other sets of organizational barriers. These challenges need to be addressed with a cultural shift within the enterprise. Organizations developing ability to absorb paradigm shifts (Walker and Brown 2019), by developing dynamic capabilities, such as dealing with complexity, facilitating reuse, enabling interoperability, client orientation, creating flexibility, adherence to privacy, facilitating communication, impact evaluation, decision-making support, and migration strategy (Gong and Janssen 2021) can overcome organizational inertness. Organizational agility (Dam et al. 2019; Verhoef et al. 2021) in learning and developing ambidextrous culture in exploring and exploiting the new technological advancements with strong data-driven analytics culture will help overcome the resistance to flexibility and lack of experimentation. The intensity of organizational learning is one of the critical intangible resources needed to build ‘Data Science’ capabilities. Barriers in the form of ‘data challenges’ (Brinch et al. 2018; Chakravorty 2020; Devasia 2018; Fakhri et al. 2020; Halaweh and Massry 2015; Lamba and Singh 2018; Lautenbach et al. 2017; Mikalef et al. 2020a; Shahbaz et al. 2019; Saldžiūnas and Skyrius 2017; Sivarajah et al. 2017; Zaki 2019), such as collection, extraction, relevance, refinement, handling, ownership, documentation, communication, management, quality, trust, security, and privacy must be dealt with utmost care and by making deliberate use of characteristics of data (Behl et al. 2019; Choi et al. 2018; Ranjan 2019; Lacam 2020; Rialti et al. 2018; Sarker et al. 2019; Schroeder 2016; Wright et al. 2019; Yadav 2017). A proper data and information governance model (Bertot et al. 2014; Chakravorty 2020; Fakhri et al. 2020; Foshay et al. 2015; Gregory 2011; Mikalef et al. 2020a; Wang et al. 2018a) would help overcome these challenges to a significant effect. The intra-firm power dynamics (Jha et al. 2020; Schroeder 2016) also play a significant role in the success of ‘Data Science’ projects. The adverse effects must be handled in organizations by encouraging formal and informal network creation amongst the management team and ‘Data Science’ teams.

Barriers at the institutional-level are equally important to be addressed as much as individual and organizational-level in the success of ‘Data Science’ projects. Institutional pressures namely coercive, normative, and mimetic (Dubey et al. 2019b) can act both as barriers and enablers based on the context and industry characteristics. Awareness of the institutional pressures help in the selection of tangible (infrastructure), intangible (culture), and human resources (‘Big Data’ skills). The competing business models with lack of transparency require compliance requirements (Lautenbach et al. 2017; Mikalef et al. 2020c) to be established by industry-level collaborations along with a clear ‘target market’ definition (Holland et al. 2020). Creating effective institutional arrangement in facilitating innovative technologies enables organizations to benefit from ‘Data Science’ (Sang et al. 2020). External market factors (Lautenbach et al. 2017) adversely affecting the ‘Data Science’ projects need to be dealt by industry-government engagement by formulating a national-level strategy for the ‘Data Science’ domain. Information Management Capability (IMC) of an organization plays an important role in effectively handling external market factors (Macada et al, 2019). Lack of competency and support by vendors and alignment between client and service provider (Behl et al. 2019; Walker and Brown 2019) needs institutional arrangement in addressing the issue. There is also need to create a competence pool in the domain of ‘Data Science’. Dearth of adequate talent in the core domains, along with ‘Data Science’ skillsets (Akter et al. 2016; Chatterjee 2020; Devasia 2018; Halaweh and Massry 2015; Jha et al. 2020; Kache and Seuring 2017; Lautenbach et al. 2017; Mikalef et al. 2019a; Ransbotham et al. 2017; Wang et al. 2018c), is another challenge that organizations face in implementing ‘Data Science’. Collaboration between industry and academia on ‘Data Science’ topics would lead to the development of skilled talent during university education. This would help to address the above-mentioned institutional concerns to a considerable extent.

The proposed conceptual framework was validated and refined through an iterative feedback process based on expert opinions. The conceptual model was shared with ten different experts leading ‘Data Science’ initiatives in their organizations at senior executive levels. These organizations work in the Retail, Media, Healthcare, Fintech, Manufacturing, Automobile, Aerospace, and Telecommunications sectors. Their opinions were given due consideration and the model was appropriately modified.

5 Discussion

In adopting ‘Data Science’, organizational approaches can be either deductive (top-down) or inductive (bottom-up). The traditional approach is the top-down approach where the business problem is defined first followed by a search for the required data. With this approach, there could be challenges with the accessibility of data and a feeble understanding of data collection methods and processes. Unless there exists high data analysis competency in the organization, this approach could be risky. With limited exposure to ‘Data Science’, a bottom-up approach is suggested where available data is first analyzed, and further, the business problem is accordingly defined based. This approach would have the following steps: (i) control of data hygiene, (ii) staff training, (iii) data value estimation, (iv) development of new dimensions for data sets, (v) evaluation of data analysis consequences, and (vi) use of insights (Saldžiūnas and Skyrius 2017). On the contrary, Lautenbach et al. (2017) suggest organizations drive BI&A usage with the top-down approach. The leadership team must keep the employees informed of the value and benefits derived through BI&A use (Lautenbach et al. 2017).

‘Data Science Strategy’ by its very nature leads to asymmetric information in the technology market (Mandlik and Kadirov 2018). In equity markets, information on firms’ investment in ‘Data Science’ leads to positive stock market reactions. Moreover, investments in small vendors tend to bring in higher returns than big vendors. Also, investors assess ‘Data Science’ investments of big firms as compared to those of small firms (Lee et al. 2017). There is a different impact on the size of returns depending on firm size and vendor size. Reliability, ROI, real-time analytics must be kept in mind while dealing with ‘Data Science’ adoption from a business perspective (Yadav 2017).

This SLR provides an eagle’s eye view of enablers and challenges of adopting ‘Data Science’ in organizations. In harnessing the power of ‘Data Science’, policies are required that are collaborative and complementary across organizations in creating a stakeholder marketplace, facilitating the generation of annotated data sets, spreading awareness relating to the contribution of AI, and supporting the start-ups (Chatterjee 2020).

5.1 Theoretical contributions

The article makes quite a significant contribution to theory in the field of ‘Data Science Strategy’.

  • This SLR recognizes that prior studies have focused upon different aspects of ‘Data Science’ adoption in organizations predominantly focusing upon ‘Big Data Analytics’ and ‘Artificial Intelligence’. The dearth of studies on ‘Data Science’ as an umbrella term that cuts across different technologies, such as Big Data, Data Analytics, AI, ML, Deep Learning (DL), Visualization, and Business Intelligence (BI) is evident. This SLR has reviewed the most relevant articles and synthesized the knowledge on ‘Data Science’ strategies. The findings and synthesized components are further developed into a conceptual framework. Developing this conceptual framework is the main theoretical contribution of this article. The findings, therefore, contribute to answering the research question ‘What are the enablers and challenges that affect the strategic implementation and success of ‘Data Science’ (AI/ML/BD) in organizations?’ and fulfil a significant gap in the literature.

  • The existing literature on ‘Big Data Analytics’ and ‘Artificial Intelligence’ draws upon strong theoretical backgrounds (Refer Table S2). ‘Data Science’ adoption in organizations has extensive opportunity in generating novel theory, along with new management practices. This article contributes to the theories by drawing upon Stimulus-Organism-Response (S–O-R) model, Contingency model, and Institutional theory in formulation of the proposed conceptual framework.

  • The proposed framework sets the agenda for further research in ‘Data Science’ strategy. This SLR article could serve as reference for practitioners as well as research scholars. Through the proposed framework, literature can be further enriched by developing the constructs grouped under ‘barriers’ and ‘enablers’, and developing them into propositions and hypotheses. Research contributions towards validating these propositions and/or hypotheses using appropriate qualitative and quantitative techniques can add to the knowledge body.

5.2 Managerial implications

Managers need to note that ‘Data Science’ is a part of the wider ecosystem including ‘Big Data’, ‘Data Analytics’, ‘Machine Learning’, ‘Artificial Intelligence’, and ‘Business Intelligence’. It also provides significant benefits over traditional analytics (Business Intelligence) systems. The considerable effect of ‘Data Science’ on the workforce is quite eminent in the next few years both at the organizational and the personal levels.

The proposed framework of ‘Data Science Strategy’ can help organizations from various industries and domains in taking the right steps towards the successful implementation of ‘Data Science’ projects. We summarize the implications to managers below.

  • Every enterprise’s ‘Data Science’ journey should start with the right business questions. Be sure of a clear business strategy and an expected value that the ‘Data Science Strategy’ can relate to. In answering the business questions pertaining to ‘Data Science’, choose the strategic choices necessary to drive the ‘Data Science’ transformation forward with resources, such as Data, Technology, Infrastructure and Human talent (skills and knowledge). Without business-appropriate choice of these resources, the desired outcome would hardly be achieved.

  • For ‘Data Science’ project to successfully lead to business outcomes, managers need to be cognizant of the barriers. Though it may not be possible to completely eliminate them, they can be minimized. The proposed framework summarizes the comprehensive list of barriers that managers need to keep their focus upon. Prior analyses of these barriers much before implementation of ‘Data Science’, would help in the planning and distribution of resources. The barriers summarized in the framework are a result of our review on multiple industry sectors and domains. We believe many organizations would almost face all or few of these barriers. Depending on the specific industry and domain, specifically applicable barriers should be considered by the managers (also refer to Supplementary Table S1). In summary, good overview of barriers at the early stages of planning and implementation of ‘Data Science’ projects will help managers to overcome hindrances and realize expected outcomes.

  • Along with barriers to ‘Data Science’, the proposed framework also summarizes the comprehensive list of enablers which intend to minimize the barriers. Enablers and barriers of ‘Data Science’ projects are inversely related. The summary of enablers for various industry sectors and domains is presented in Supplementary Table S2. Detailed prior research on all the enablers and specific attention to the ones relevant to their industry would help them to optimize the resources and accelerate the ‘Data Science’ journey to realizing ‘business value’.

  • The outcomes of ‘Data Science’ projects can be either functional or symbolic. Organizations must focus upon clearly targeted outcomes which can be measured using process KPIs and business value in terms of profits and cost savings (RoI). The value in the form of competitive advantage and structural transformation must also be monitored as indicated in the framework. Managers must also recognize the importance of ‘Data Science’ adoption in the form of symbolic value add. As a symbolic business value, ‘Data Science’ adds to the organizations’ brand value, and recognition as technology-driver. ‘Data Science’ projects can also enhance and lead social, environmental and corporate friendly initiatives within the organization. Managers must realize that the business outcomes in most ‘Data Science’ project initiatives are iterative and prone to initial hiccups. Awareness and balance of barriers and enablers from the proposed framework will help to realize the desired results in a stipulated time span. Therefore, early-stage setbacks on ‘Data Science’ projects should not deter the purpose of successful utilization of ‘Data Science’. Despite the mentioned barriers, managers in organizations need to keep up their motivations and convince the stakeholders (sub-ordinates, partners, vendors and management teams), and count on the ‘Data Science’ enablers.

Knowledge acquired from the learnings of the proposed framework in the form of resources, barriers, enablers, and strategic business value would help organizations in adopting ‘Data Science’ and exploit new business opportunities.

6 Limitations and future research

We acknowledge the limitations of our study and the readers should interpret the content of this SLR article in the context of these limitations. Primarily, the SLR has been reliant on accessible research articles from four databases, namely, EBSCO Business Source Ultimate, Scopus, Science Direct, and Pro-Quest databases, in English language. Though we have conducted SLR by identifying all possible articles relevant, we could have missed some from other leading databases (Web of Science). In addition, the interpretation of findings and synthesis are subjective, a fact which we (authors) have attempted to overcome by examining the articles independently. Secondly, amongst the mentioned databases used, the articles are identified based on ‘keyword search’ in the Title, Abstract, and Article keywords sections. We could have failed to consider the relevant articles where the subject area of interest is embedded in the main text. Thirdly, the proposed conceptual framework needs to be developed into an hypotheses and validated with empirical studies.

In addressing the first and second limitations mentioned, it is recommended further to conduct literature reviews from other databases including keyword search throughout the article (not limiting only to Title, Abstract, and Article keywords). To address the third limitation mentioned above, the proposed framework could help future researchers push significant agenda for ‘Data Science Strategy’ research.

We have identified the following future research opportunities:

  • Developing constructs for each of the constituents in the framework, formulating and empirically validating hypotheses.

  • As highlighted in motivation for this SLR, the literature articles have focused highly on certain individual components of ‘Data Science Strategy’ in discrete (viz. Big Data Strategy, Big Data Analytics Strategy, Artificial Intelligence Strategy, and BI&A Strategy). Considerably less research attention is paid to other individual components of ‘Data Science strategies’ including data visualization strategies, Machine Learning and Deep Learning strategies in enterprises, and ‘Data Science’ strategies with ‘Small Data’. Data visualization is an important component in persuading the top management stakeholders. Future studies could focus upon examining the ‘Data Science’ strategies in these neglected areas.

  • A large number of ‘Data Science Strategy’ studies have been dedicated to ‘Big Data Analytics’. Apart from ‘Big Data Analytics’, there also exists relevance for ‘Small Data’ analytics. For many small and medium businesses, ‘Small Data’ also provides great business insights. There is dearth of studies on enablers, barriers and impact of ‘Small Data’ implementation and strategic constituents. Future researchers can focus upon this direction.

  • Extant literature predominantly examines the enablers and barriers of ‘Data Science’ adoption in certain industry sectors and business domains (viz. Healthcare, Manufacturing, Construction, E-commerce, Telecommunications, Banking and Finance, Automotive, Aerospace, and Entertainment) (Ahmed et al. 2018; Dremel et al. 2017; Holland et al. 2020; Lee et al. 2017; Saldžiūnas and Skyrius 2017; Yang et al. 2015; Wang and Hajli 2017; Wamba et al. 2017). However, many sectors are yet to be explored (viz. Offline and Online Education, Disaster management) in this context. Researchers could pay attention to these unexplored areas.

  • Sector-wise comparative analysis between enablers and barriers of ‘Data Science’ Strategy between enterprises has also not been attempted in extant literature. Studies in such direction can bring in key insights which can be used for peer-to-peer learning among industries. Owing to the current Covid-19 pandemic situation, organizations are speeding up their digital transformation journeys. As a result, they are capturing large amounts of data which is unprecedented and new AI use cases are also evolving. The impact of Covid-19 pandemic on the overall ‘Data Science’ ecosystem including strategic dimensions needs investigation.

  • ‘Data Science Strategy’ studies have largely focused on industry sectors in countries such as the USA, India, China and Europe (Refer to Fig. S4). Except India and China, the industries in other Asia–Pacific regions have not been examined extensively. Future studies can be devoted to this aspect. Additionally, market-specific condition combination (developed & developing economies) along with organizational culture (explorative and exploitative, flexibility) can further be explored. The adoption of ‘Data Science’ in internationalization is still an emerging research interest. The organizational variables affecting the relationship between ‘Data Science’ and internationalization could be of interest to future researchers.

  • With many start-up companies mushrooming, studies exploring the factors associated with ‘Data Science’ adoption and success, and contrasting them with that of established firms is an area can be researched. Further research should be conducted to understand the role of ‘Data Science’ in digital transformation.

  • There may be scenarios, wherein the Human Resource Management (HRM) dimension has led to a failure in the ‘Data Science’ implementation. This dimension has not been studied extensively. Studies can be conducted specifically on the HRM dimension of the ‘Data Science’ implementation.

  • Consideration towards growing diversity of consumer preferences ethically is strategically relevant to deploy the ‘Data Science’ business models (Wiener et al. 2020). On the contrary, Biazzin and Castro-Carvalho (2019) studied the impact of buyers’ current behaviour and found it does not significantly impact their deployment. Societal impacts of different ‘Data Science’ strategies should involve the proper implementation of rules and policies to ensure ethical use. This could also be direction for future research.

  • Studies are also limited on the implementation of ‘Data Science Strategy’ in government sector organizations. As an example, the citizens’ data collected by Government of India through ‘Aadhar’ unique identification number, ‘Aarogya Sethu’, and ‘CoWIN’ can be used for building social welfare and health infrastructure development programs. The benefits and the intertwined challenges with ‘Data Science Strategy’ in the government sector offer further opportunities for research.

7 Conclusion

The SLR aimed at synthesizing the enablers and barriers of 'Data Science Strategy' adoption in different industries, sectors, and business domains. The review conducted an in-depth study of 113 published articles (1998- 2021) sourced from four different databases, namely, EBSCO Business Source Ultimate, Scopus, Science Direct, and Pro-Quest databases, respectively. The findings show that the enablers of ‘Data Science’ and barriers for adoption of ‘Data Science Strategy’ are synthesized. The SLR identified four themes namely ‘Content’ (Data Quality & Governance), ‘Context’ (Technology and Environmental), ‘Intent’ (Culture, Alignment, Agility), and ‘Outcomes’ (Functional and Symbolic business value) that influence the development of ‘Data Science Strategy’ in organizations.

The key lessons learnt are formulated in to a conceptual framework linking ‘Data Science’ resources to business outcomes, influenced by ‘barriers’ and ‘enablers’. In summary, the barriers and enablers are segregated under individual, organizational and institutional contexts. Organizations may not completely be able to eliminate the ‘barriers’, but capitalizing on ‘enablers’ would mitigate the risks of ‘Data Science’ adoption and success of projects leading to desired business outcomes. The study also summarizes the ‘barriers’ and ‘enablers’ of ‘Data Science Strategy’ in organizations depending on industry sectors and domains. Although most of the barriers and enablers might be common across businesses, organizations should prioritize those relevant to their areas of expertise.

The study also emphasizes on the significant role of 'Data Science Strategy' in organizations augmenting 'Data Strategy', 'digitalization' and, eventually 'digital transformation' leading to business success. Research gaps are identified for future attention. The academic and practitioner community should focus their efforts on promoting the development and implementation of a well-crafted, outcome-based, holistic 'Data Science Strategy' so as to reap the benefits of the digital revolution that’s happening globally. Additionally, the onset of Covid-19 pandemic has accentuated the challenges faced by enterprises across various domains which further pushes for a need for a ‘Data Science Strategy’ to aid in the development of a ‘Digital Strategy’ for an organization.