Introduction

The amalgamation of technologies such as sensor communications, cloud computing, the Internet of Things (IoT), artificial intelligence, and machine and deep learning plays a vital role in the smart world [1]. IoT is a prevailing technology capable of transforming human lives by bringing ease and smartness to varied conventional application domains. As shown in Figs. 1, 2, and 3, IoT is a hybrid environment that combines many technologies, such as sensing, data storage, data analytics, and connectivity of things. Further, IoT extends the capabilities of physical things [2].

IoT applications such as smart cities, smart healthcare systems, smart buildings, smart transport, and smart environments [3], industry, agriculture, supply chain management [4], smart retail, and location-based services may deal with sensitive data such as health information, financial information [5], location footprints, Personally Identifiable Information (PII) [6], and details of personal life. The data deluge from billions of entities producing information is a significant threat to privacy [7] (Fig. 4).

Fig. 1 Introduction to the Internet of Things and applications

Fig. 2 Major components of Internet of Things

Privacy is the right of individuals to keep their information secret and to have control over it [8]. Privacy preservation is an important aspect that must be considered in every logical and physical system to reduce the possibility of privacy breaches. Ensuring information privacy is a growing concern for governments, businesses, and consumers alike [9]. In IoT-based networks, personal information is collected from smart devices, and weak privacy measures can lead to misuse of sensitive information. If this personal information is stolen, the results can be detrimental [10].

Some of the significant privacy challenges in IoT are as follows:

  1. What private data are sensed, where are these data stored, and how and by whom are they used? [10]

  2. How to automate the identification of sensitive and non-sensitive data?

  3. How to allow users to control and manage their data, maintain users’ anonymity, and preserve data integrity in each phase of the data life cycle? [5]

  4. How to implement an efficient mechanism that is suitable for pervasive infrastructure and resource-constrained IoT devices? [11]

Fig. 3 A typical architecture of the IoT environment

Fig. 4 Application domains and research challenges of IoT

Many researchers have emphasized that privacy and security are the most challenging problems in IoT because of the risk of leakage of the user’s private information from IoT services [12]. Data protection by design and by default (or privacy by design) is crucial to address privacy and protection of data [13]. Users will accept IoT-based systems only if they are secure, trustworthy, and privacy-preserving [8]. Users must be equipped with tools to retain their anonymity in an IoT-based connected world [7]. Therefore, in an IoT environment, an efficient and well-planned strategy is necessary to preserve privacy. The novelties and contributions of this paper are as follows:

  1. A Multilevel Noise Mechanism has been proposed for data collection to ensure privacy preservation in the Internet of Things environment.

  2. A user preferences-based data classifier has been proposed to classify sensitive and non-sensitive data in the Internet of Things environment.

  3. A Noise Removal and Fuzzification Mechanism has been proposed for data access to ensure privacy preservation in the Internet of Things environment.

The remainder of this paper is organized as follows: “Related work and motivation” describes related work and motivation. “Adversary model and design objectives” presents the adversary model and design objectives. The noise-based privacy-preserving model is described in “Noise based privacy preserving model”. The experiments and results are given in “Experiments and results”, and “Limitations and future scope” concludes the paper.

Related work and motivation

Consumer trust can be enhanced by privacy preservation in IoT, which can be achieved by fulfilling the privacy requirements at data generation, storage, usage, and sharing [10]. Ziegeldorf et al. [14] analyzed the privacy issues, discussed the evolving features and trends in IoT, and classified privacy threats. According to the survey in [15], more research is needed to ensure security and privacy for the success of the IoT paradigm. Because IoT devices are highly resource-constrained, with miniature power sources, small memory, and limited processing capability [16], user privacy and data protection, authentication and identity management, trust management, policy integration, authorization and access control, and end-to-end security are security and privacy challenges in the IoT that need to be addressed (Tables 1 and 2).

The collection of personal data and the usage of these data are challenges to individual privacy in the IoT [17]. Corcoran [18] introduced different privacy classes and outlined some ideas for an improved privacy framework for IoT, such as protecting data at the data source. To mitigate the heavy computational burden that cryptographic operations place on the sensors used in medical applications, Moosavi et al. [19] proposed a Secure and Efficient Authentication and Authorization (SEA) architecture in which distributed smart e-health gateways perform authentication and authorization on behalf of the medical sensors. The SEA architecture is based on the fact that heavyweight security protocols and certificate validation can be handled efficiently by the smart e-health gateway and the remote end-user because both have sufficient resources.

Appavoo et al. [20] proposed a privacy-preserving model to prevent service providers from revealing sensed values, sensor types, and user preferences. The proposed work can be considered a simple form of functional encryption. The case of a semi-trusted service provider has been considered. In this work, the authors represented privacy loss (Eq. 1) in the form of mutual information [21].

$$\begin{aligned} I(\Psi , V; \delta ) = H(\Psi , V) - H(\Psi , V|\delta ), \end{aligned}$$
(1)

where \(\Psi \), V, and \(\delta \) are the random variables for the set of sensors that can be utilized, the set of sensed values, and the set of outcomes for the trigger conditions, respectively. \(H(\Psi , V)\) represents the maximum information that can be predicted for the sensors and their values.
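
To make the privacy-loss measure of Eq. 1 concrete, the following minimal Python sketch computes the mutual information between a (sensor, value) pair and the trigger outcome from a joint probability table. The function names `entropy` and `privacy_loss` and both toy distributions are illustrative assumptions and are not taken from [20].

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy in bits of a mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def privacy_loss(joint):
    """Mutual information I(X; delta) for joint[(x, delta)] = probability.

    x stands for a (sensor, value) pair and delta for a trigger outcome.
    """
    p_x = defaultdict(float)
    p_d = defaultdict(float)
    for (x, d), p in joint.items():
        p_x[x] += p
        p_d[d] += p
    # H(X | delta) = H(X, delta) - H(delta)
    h_x_given_d = entropy(joint) - entropy(p_d)
    return entropy(p_x) - h_x_given_d


# Hypothetical case 1: the trigger outcome fully reveals the (sensor, value) pair.
revealing = {
    (("hr", "high"), "alarm"): 0.5,
    (("hr", "normal"), "ok"): 0.5,
}
# Hypothetical case 2: the trigger outcome is independent of the pair.
independent = {
    (("hr", "high"), "alarm"): 0.25,
    (("hr", "high"), "ok"): 0.25,
    (("hr", "normal"), "alarm"): 0.25,
    (("hr", "normal"), "ok"): 0.25,
}
```

In the first toy case the loss equals the full entropy of the pair (one bit); in the second it is zero, because observing the outcome tells the service provider nothing.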

Turgut and Boloni [22] concentrated on the value and cost of data exchange in IoT, alongside other types of cost. They described an interesting relationship between the value of information and the cost of privacy (the customer’s benefit in Eq. 2 and the business benefit in Eq. 3) for the success of the IoT paradigm. The notations used in these equations are defined in Table 4.

$$\begin{aligned}&\eta _{\text {service}} - \sigma _{\text {privacy}} - \sigma _{\text {hardware}}^{\text {user}} - \sigma _{\text {payment}} > 0 ,\end{aligned}$$
(2)
$$\begin{aligned}&\rho _{\text {information}} - \sigma _{\text {hardware}}^{\text {business}} + \sigma _{\text {payment}} > 0. \end{aligned}$$
(3)

Based on the notion that trust can be directly related to privacy [23], Butun [24] mapped the relation between privacy and trust by integrating the multi-dimensional relationship among the sensitivity level of PII items, privacy, and trust (Eq. 4).

$$\begin{aligned} \Gamma (\phi ; \varepsilon , \Omega , \pi ) = \frac{1}{1+ e^{-(-\varepsilon (\phi -\pi \Omega ))}}. \end{aligned}$$
(4)

Jayaraman et al. [25] introduced a privacy-preserving IoT architecture and data ingestion scheme in which the produced IoT data are split into R parts, where R is the number of servers. If the \(j\text {th}\) datum produced by an IoT device is \(D_j\) and the number of servers is three (R = 3), then it is split into three data addends, namely \(\alpha _{1j}\), \(\alpha _{2j}\) and \(\alpha _{3j}\), where

$$\begin{aligned} D_j = \sum _{i=1}^{R}\alpha _{ij}. \end{aligned}$$
(5)
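
As a minimal illustration of the additive splitting in Eq. 5, the sketch below draws R - 1 random addends and sets the last one so that the sum equals the datum. The helper name `split_into_addends`, the `spread` bound, and the use of integer-encoded data are assumptions made for this example; Python's `random` module is used only for readability and is not a cryptographically secure source.

```python
import random


def split_into_addends(datum, r, rng=None, spread=10**6):
    """Split an integer datum into r integer addends whose sum is the datum."""
    if r < 1:
        raise ValueError("r must be at least 1")
    rng = rng or random.Random()
    # The first r - 1 addends are random; the last one makes the sum come out.
    addends = [rng.randint(-spread, spread) for _ in range(r - 1)]
    addends.append(datum - sum(addends))
    return addends


# Hypothetical datum D_j = 72 split across R = 3 servers.
addends = split_into_addends(72, 3, random.Random(7))
```

No single addend reveals the datum on its own; only the sum over all R servers does.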

Along with the privacy-preserving IoT architecture, Jayaraman et al. also proposed a privacy-preserving data access scheme based on the homomorphic properties of the Paillier cryptosystem (Tables 1 and 2).

The Dynamic Privacy Protection (DPP) model [26] is designed to ensure the privacy of mobile device users. The DPP model generates a privacy protection plan that determines the security mode for each data item or data package. In this model, privacy protection levels are classified based on privacy weight. The total privacy weight \({\mathbb {P}}\) is calculated using Eq. (6). In this equation, \(N^e(D_i)\) is the number of data items or data packages \((D_i)\) that use the higher-level security mode, and \(N^n(D_i)\) is the number that use the lower-level security mode. If the binary function s(i) = 1, encryption is used; if s(i) = 0, no encryption is used.

$$\begin{aligned} {\mathbb {P}} = \sum _{s(i)=1} N^e(D_i) \times W^e(D_i) + \sum _{s(i)=0} N^n(D_i) \times W^n(D_i). \end{aligned}$$
(6)
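
A small worked sketch of Eq. (6) follows. It assumes that \(W^e(D_i)\) and \(W^n(D_i)\) are the per-mode privacy weights of a data package, which the text above does not define explicitly; the `Package` class and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Package:
    encrypted: bool  # s(i): True means the higher-level (encryption) mode
    count: int       # N^e(D_i) or N^n(D_i), depending on the mode
    weight: float    # W^e(D_i) or W^n(D_i), assumed to be a privacy weight


def total_privacy_weight(packages):
    """Evaluate Eq. (6): sum count x weight over both security modes."""
    encrypted_part = sum(p.count * p.weight for p in packages if p.encrypted)
    plain_part = sum(p.count * p.weight for p in packages if not p.encrypted)
    return encrypted_part + plain_part


# Two encrypted packages of weight 3.0 and four unencrypted ones of weight 0.5.
plan = [Package(True, 2, 3.0), Package(False, 4, 0.5)]
```

For this hypothetical plan the total is 2 x 3.0 + 4 x 0.5 = 8.0.
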
Table 1 Basic concepts used for privacy preservation in the various existing frameworks/approaches
Table 2 Key parameters, challenges, important findings of the existing studies

Many researchers have tried to address security and privacy issues in the Internet of Things. Several privacy preservation techniques for IoT have been proposed, but to the best of our knowledge, little research has been carried out to ensure end-to-end privacy, i.e., privacy preservation in all the layers of the IoT ecosystem, along with implementation and detailed analysis of results. Also, many proposed privacy-preserving frameworks are based on cryptographic operations. Many of the existing frameworks do not include data classifier mechanisms or user customization-based privacy preservation. Much of the existing work on IoT privacy has not considered the trade-off between privacy and quality of service in practical scenarios. This paper addresses these issues, presents a systematic flow of IoT data, and implements and analyzes the Noise-Based Privacy-Preserving model (NBPPM model). The proposed model’s novelty is that it ensures data privacy with fair efficiency at all the layers (edge layer, middleware, and application layer) of the IoT ecosystem.

Adversary model and design objectives

This section focuses on various privacy threats associated with IoT. In the adversary model, it is assumed that an adversary is well equipped to monitor communication channels. A malicious insider at the data storage level (such as a rogue administrator) can access sensitive and non-sensitive data, analyze the data, and make inferences to gain advantages. An unauthorized user can access sensitive data at the application level, and a service provider can access user data to provide services to the user.

As an example of an inference threat in an IoT-based healthcare application, let us assume that the universal set of sensors in the IoT is \(X = \{s_1, s_2, s_3, \ldots s_n\}\), where n is the number of sensors in the IoT-based system, and that the universal set of locations of these sensors is \(L = \{l_1, l_2, l_3, \ldots l_n\}\). The set of data produced by the sensors in X is \(D = \{d_1, d_2, d_3, \ldots d_n\}\), and the set of m different kinds of diseases is \(Y = \{y_1, y_2, y_3, \ldots y_m\}\). An adversary equipped with tools and malicious intent can draw useful inferences by employing the following inference rules in an inference attack:

$$\begin{aligned} R_1:(d_1 \pm a_1) \wedge (d_2 \pm a_2) \wedge (d_3 \pm a_3) \wedge \ldots \wedge (d_n \pm a_n)\rightarrow & {} y_1 \\ R_2:(d_1 \pm b_1) \wedge (d_2 \pm b_2) \wedge (d_3 \pm b_3) \wedge \ldots \wedge (d_n \pm b_n)\rightarrow & {} y_2 \\ R_3:(d_1 \pm c_1) \wedge (d_2 \pm c_2) \wedge (d_3 \pm c_3) \wedge \ldots \wedge (d_n \pm c_n)\rightarrow & {} y_3 \\&\vdots&\\ R_j:(d_1 \pm k_1) \wedge (d_2 \pm k_2) \wedge (d_3 \pm k_3) \wedge \ldots \wedge (d_n \pm k_n)\rightarrow & {} y_m \end{aligned}$$

where \(a_1, \ldots a_n\), \(b_1,\ldots b_n\), \(c_1, \ldots c_n\) and \(k_1, \ldots k_n\) are constants used to form specific ranges for the derivation of a useful inference rule. For example, through the above inference rules, an eavesdropper can infer a patient’s disease, which may be private information for the patient, and through the location set L, a linkage-based attack can be performed on the pairs \(\{(d_1, l_1), (d_2, l_2), (d_3, l_3),\ldots (d_n, l_n)\}\). Such attacks can result in physical, mental, economic, and social exploitation of the victim.
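
The inference rules above can be sketched as simple range checks. In this minimal Python example each rule is read as "every reading lies within a tolerance of a reference value", which is one possible interpretation of the \(d_i \pm a_i\) notation; the rule table, thresholds, and labels are hypothetical.

```python
def infer(readings, rules):
    """Return the labels whose range conditions all match the readings.

    rules maps a label to one (reference, tolerance) pair per sensor.
    """
    matches = []
    for label, conditions in rules.items():
        if len(conditions) != len(readings):
            continue  # rule written for a different sensor set
        if all(abs(d - ref) <= tol for d, (ref, tol) in zip(readings, conditions)):
            matches.append(label)
    return matches


# Hypothetical rules over two sensors (heart rate, body temperature).
rules = {
    "y1": [(120.0, 10.0), (38.5, 0.5)],
    "y2": [(80.0, 5.0), (36.5, 0.5)],
}

clear_readings = [125.0, 38.2]   # what an eavesdropper sees without protection
noised_readings = [165.0, 51.7]  # the same readings after some added noise
```

With the clear readings the adversary's rule for `y1` fires; with the noised readings no rule matches, which is the effect the proposed model aims for.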

Security and privacy threats in IoT

An overview of the major security and privacy threats [14, 38, 39, 40] in the IoT environment is given in Table 3.

Table 3 Overview of various security and privacy threats in the IoT

Problem definition and design objectives

The critical research problem is defined as developing a systematic model to ensure end-to-end privacy against various threats for resource-constrained IoT environments. As IoT components such as sensors and actuators have limited computing capabilities and are not suitable for performing complex computing operations [33], our objective was to plan and develop a model against privacy threats that incorporates privacy preservation characteristics such as safeguarding sensitive information, data access control, query privacy, and user-based privacy customization. Along with privacy preservation, our main objective was to reduce the computational overhead for resource-constrained IoT environments.

Noise based privacy preserving model

This section presents the proposed noise-based privacy-preserving model. The methodology, the structural diagram, and the detailed functioning of all modules involved in the NBPPM model are described.

Fig. 5 Overview of the core components in IoT for the NBPPM model

Overview

Let us assume that a typical IoT environment consists of IoT devices, middleware, data storage, and user devices with apps that consume service providers’ services. The components of the NBPPM model are shown in Fig. 5. Data produced by a source device must be protected in transit, in process, and at rest from an intruder that may exist between the source device and a legitimate user device. This goal is achieved in the proposed NBPPM model by adding noise while data move from the data source to data storage and removing the noise at the user device. The proposed model also incorporates a fuzzification mechanism for privacy customization. Thus, the proposed NBPPM model provides twofold privacy preservation using noise and fuzzification.

Fig. 6 Overall layout of the proposed model

Fig. 7 Flowchart of the proposed methodology

Methodology

The proposed NBPPM model’s fundamental modules are the data classification module, multilevel noise treatment module, and noise removal and fuzzification module. In this subsection, each module has been described comprehensively. The overall layout of the proposed model is shown in Fig. 6.

As shown in the flowchart of the proposed methodology (Fig. 7), level 1 noise is added to all types of data (i.e., sensitive and non-sensitive data). After the level 1 noise addition, each datum is split into data addends. A data classifier synchronized with the user customization setting classifies the data according to the user preferences. If the data attribute is sensitive, the data addends proceed to level 2 and level 3 noise addition. If the data attribute is non-sensitive, the data addends proceed to level 3 noise addition only (Algorithm 1). All of these noised data addends are stored in the data repository (i.e., cloud storage). An authenticated user can access the noised data addends using valid credentials. At the user end, the data addends are de-noised using the noise removal process (Algorithm 2). Further, if a service provider requests a user’s data to provide services, the user can supply fuzzified data (based on the user’s privacy preferences) to the service provider (Fig. 9).

Fig. 8 Multilevel noise treatment methodology of the NBPPM model

Table 4 Summary of notations

Data classification module

Data classification is a necessary step before a privacy protection mechanism is applied. The data classification mechanism acts as a classifier that categorizes data into two classes: sensitive and non-sensitive. One of the major issues in data classification is who decides, and how, which data attributes are sensitive and which are non-sensitive. The data owner is the entity best placed to decide the sensitivity of his/her data in an IoT environment. In our proposed data classification mechanism, a data owner can customize his/her data privacy by setting each attribute to sensitive or non-sensitive mode at the application level, and this setting is then synchronized with the data classifier module. Depending upon its sensitivity, the data are treated with multiple levels of noise. An alternative policy can also be adopted for application-specific scenarios, i.e., IoT environments in which some data owners cannot judge data sensitivity correctly or have no knowledge of it. In this case, a predefined data sensitivity can be used. This predefined data sensitivity can be decided according to the specific IoT application and the data protection rules and regulations of the particular country. For instance, in an IoT healthcare system, blood glucose level, heart rate, respiration rate, blood pressure, and body temperature can be put in the sensitive category, and room temperature and humidity can be considered non-sensitive. A hybrid policy can also be deployed, combining predefined data classification with user-defined privacy preferences. A user can then change the predefined settings according to his/her personal privacy preferences in the IoT ecosystem.
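
The hybrid policy described above can be sketched as a lookup in which user preferences override predefined defaults. The attribute names, the `PREDEFINED` table, and the choice to treat unknown attributes as sensitive are assumptions made for this illustration, not details taken from the implemented classifier.

```python
# Hypothetical predefined sensitivity for an IoT healthcare application.
PREDEFINED = {
    "blood_glucose": True,
    "heart_rate": True,
    "respiration_rate": True,
    "blood_pressure": True,
    "body_temperature": True,
    "room_temperature": False,
    "humidity": False,
}


def is_sensitive(attribute, user_prefs=None, defaults=PREDEFINED):
    """Classify an attribute; user preferences override predefined defaults."""
    if user_prefs is not None and attribute in user_prefs:
        return user_prefs[attribute]
    # Assumption: attributes unknown to both tables are treated as sensitive.
    return defaults.get(attribute, True)


# A user who considers humidity readings private.
prefs = {"humidity": True}
```

Defaulting unknown attributes to the sensitive class is a conservative design choice: it costs extra noise levels but avoids under-protecting data that nobody has classified yet.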

Multilevel noise treatment

In the multilevel noise treatment module of the NBPPM model, noise acts as a private key for the user. A random number generation algorithm is used to generate and divide noise into sub-noises. Let P be the generated noise; then P will be divided into three sub-noises \(P_1\), \(P_2\), and \(P_3\) through a random number generation algorithm at the user end. Each sub-noise \(P_1\), \(P_2\), and \(P_3\) is privately shared with the Data-Source, middleware, and data storage server, respectively.
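
One way to divide a generated noise P into three sub-noises is sketched below. The source does not state how the division is performed, so the additive partition (\(P_1 + P_2 + P_3 = P\)), the integer encoding, and the function name `divide_noise` are all assumptions; the standard-library `secrets` module is used because the sub-noises act as key material.

```python
import secrets


def divide_noise(p, parts=3):
    """Split a positive integer noise p into positive integers summing to p."""
    if parts < 2 or p < parts:
        raise ValueError("need parts >= 2 and p >= parts")
    # Pick parts - 1 distinct cut points in 1..p-1, then take the gaps.
    cuts = set()
    while len(cuts) < parts - 1:
        cuts.add(1 + secrets.randbelow(p - 1))
    bounds = [0] + sorted(cuts) + [p]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


# Hypothetical noise P = 1000 divided into P1, P2, P3.
p1, p2, p3 = divide_noise(1000)
```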

figure a

Data splitting and multilevel noise treatment are two critical steps of the NBPPM model, as shown in Fig. 7. Each sensed datum D in the IoT environment is treated with sub-noise \(P_1\) at level 1 using an operator picked from the row of the operator table that corresponds to the attribute type \(F_i\) of the sensed data (Table 5). Operator selection for the level 1 sub-noise is based on a modulo operation on the Data Identifier, i.e., the operator is taken from the \(Q\text {th}\) position of the row, where the modulus N is the number of operators in a row (N = 9 for Table 5). After the level 1 noise treatment, the resultant data are split into three data addends, namely X, Y, and Z. The data classifier module checks the sensitivity of the attribute to which the data addends X, Y, and Z belong. If these data addends are part of a sensitive attribute type, each of them is treated with level 2 and level 3 sub-noises. If the data addends are part of a non-sensitive attribute, each of them passes through the level 3 sub-noise treatment only. For instance, as shown in Fig. 7, the sensed data D are treated with noise \(P_1\) at level 1, and then the resultant data are split into three data addends, namely \({(X, Y, Z)_{F_i}}\). The data classifier then checks the sensitivity of attribute type \(F_i\). If \(F_i\) is a sensitive attribute type, then \({(X, Y, Z)_{F_i}}\) will be treated with noise \(P_2\) and \(P_3\), resulting in \({(A, B, C)_{F_i}}\) and \({(K, L, M)_{F_i}}\), respectively. If \(F_i\) is non-sensitive, then \({(X, Y, Z)_{F_i}}\) will be treated with noise \(P_3\), resulting in \({(K', L', M')_{F_i}}\). Both \({(K, L, M)_{F_i}}\) and \({(K', L', M')_{F_i}}\) are stored in the long-term storage or the cloud (Fig. 8).
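
The treatment steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' Algorithm 1: the operator row is hypothetical (Table 5 is not reproduced here), the position is assumed to be the Data Identifier modulo N, the split is assumed to be additive, and levels 2 and 3 are assumed to add the sub-noise to each addend.

```python
import random

# Hypothetical operator row for one attribute type (N = 9 entries).
OPERATOR_ROW = ["+", "-", "*", "+", "*", "-", "+", "-", "*"]


def apply_operator(op, value, noise):
    """Apply one level 1 operator to the datum."""
    if op == "+":
        return value + noise
    if op == "-":
        return value - noise
    if op == "*":
        return value * noise
    raise ValueError(f"unknown operator {op!r}")


def treat(datum, data_id, sensitive, p1, p2, p3, rng=None):
    """Return the three stored addends for one integer datum."""
    rng = rng or random.Random()
    op = OPERATOR_ROW[data_id % len(OPERATOR_ROW)]  # assumed Q = D_ID mod N
    noised = apply_operator(op, datum, p1)          # level 1
    x = rng.randint(-1000, 1000)
    y = rng.randint(-1000, 1000)
    addends = [x, y, noised - x - y]                # X + Y + Z = noised datum
    if sensitive:
        addends = [a + p2 for a in addends]         # level 2 -> (A, B, C)
    return [a + p3 for a in addends]                # level 3 -> stored addends


stored_sensitive = treat(72, 10, True, 5, 3, 2, random.Random(0))
stored_plain = treat(72, 9, False, 5, 3, 2, random.Random(0))
```

The stored addends are individually random-looking; only a party holding the sub-noises, the operator position, and the sensitivity flag can walk the levels back.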

Noise removal and fuzzification

Noise removal at the user device is the reverse of the multilevel noise treatment mechanism. In an IoT environment, the user requests a service from the service provider. In order to provide the service, the service provider requests user data. As shown in Fig. 9, the user accesses the requested data from long-term data storage with valid user credentials. In the proposed NBPPM model, an authentication mechanism verifies user validity through a username and password. A valid user can access the noisy data through a secure channel, and then the noise removal process is initiated using the sub-keys, which together act as the private key for the user. The process of noise removal and fuzzification is shown in Algorithm 2.
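
The reverse order of noise removal can be illustrated with a small worked example. This is a hedged sketch, not the authors' Algorithm 2: it assumes that levels 2 and 3 added the sub-noise to each addend, that the addends sum to the level 1 result, and that the level 1 operator was "+" or "-"; the numbers are hypothetical.

```python
# Inverse of each assumed level 1 operator.
INVERSE = {
    "+": lambda value, noise: value - noise,
    "-": lambda value, noise: value + noise,
}


def remove_noise(stored, op, sensitive, p1, p2, p3):
    """Recover the original datum from three stored noisy addends."""
    addends = [a - p3 for a in stored]           # undo level 3
    if sensitive:
        addends = [a - p2 for a in addends]      # undo level 2
    noised = sum(addends)                        # rejoin X + Y + Z
    return INVERSE[op](noised, p1)               # undo level 1


# Worked example: D = 72, P1 = 5 with "+", split as 30 + 25 + 22,
# then P2 = 3 and P3 = 2 added to each addend for a sensitive attribute.
recovered = remove_noise([35, 30, 27], "+", True, 5, 3, 2)
```

Undoing the levels in the opposite order of their application is what lets a user who holds all three sub-noises recover the datum exactly.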

Table 5 An example of an operator table
Fig. 9 Noise removal and fuzzification methodology of the NBPPM model

Privacy is ensured through the fuzzification process when data are transferred between the user and the service provider. A sub-module termed the privacy manager, shown in Fig. 9, plays a vital role in user privacy customization. A fuzzifier sub-module is synchronized with the user’s privacy preferences. A user can set his/her privacy preferences for a particular service, and the fuzzifier accordingly decides the quality of the data to be sent to access that service.

figure b

The functioning of the fuzzifier is as follows. As already defined, the universal set over the sensor domain is \(X = \{s_1, s_2, s_3, \ldots s_n\}\). A user can set the sensitivity level for the data attribute of a sensor node (\(s_i\)) that senses a specific parameter value. Two fuzzy sets \({\tilde{A}}\) and \({\tilde{\lambda }}\) are defined as follows:

$$\begin{aligned} {\tilde{A}} = \text {`Sensitive data' and}\ {\tilde{\lambda }} = \text {`Obfuscation quantity'}. \end{aligned}$$

The membership functions of \({\tilde{A}}\) and \({\tilde{\lambda }}\) are \(\mu _{{\tilde{A}}}\) and \(\mu _{{\tilde{\lambda }}}\), respectively, where \(\mu _{{\tilde{A}}}\) \(\in \) [0, 1] and \(\mu _{{\tilde{\lambda }}}\) \(\in \) [0, 1]. The value of \(\mu _{{\tilde{A}}}\) may be provided by the user through an interface and indicates the level of data sensitivity. The value of \(\mu _{{\tilde{\lambda }}}\) indicates the level of obfuscation and is determined by the value of \(\mu _{{\tilde{A}}}\), i.e., \(\mu _{{\tilde{\lambda }}}\) depends on \(\mu _{{\tilde{A}}}\). An illustrative example of the relationship between \(\mu _{{\tilde{A}}}\) and \(\mu _{{\tilde{\lambda }}}\) is as follows (Eq. 7 and Table 6):

$$\begin{aligned} \mu _{{\tilde{\lambda }}} = f(x, {\mu _{{\tilde{A}}}}) = {\left\{ \begin{array}{ll} 0 &{} \mu _{{\tilde{A}}}=0 \\ \mu _{{\tilde{A}}} + c_1, &{} 0.1 \le \mu _{{\tilde{A}}} \le 0.4, 0.1 \le c_1 \le 0.4 \\ \mu _{{\tilde{A}}} + c_2, &{} 0.4< \mu _{{\tilde{A}}} < 0.7, 0.1 \le c_2 \le 0.3 \\ 1 &{} \text {otherwise} \end{array}\right. },\nonumber \\ \end{aligned}$$
(7)

where x \(\in \) X and \(c_1\) and \(c_2\) can be fixed within a range and used to add the required quantity of the noise.
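
The piecewise mapping of Eq. (7) can be written directly as a function. The sketch below follows the equation as printed; the function name `obfuscation_level` and the default values of \(c_1\) and \(c_2\) are assumptions. Note that, as printed, sensitivity values strictly between 0 and 0.1 fall into the "otherwise" branch.

```python
def obfuscation_level(mu_a, c1=0.2, c2=0.2):
    """Map a sensitivity level mu_A in [0, 1] to an obfuscation level (Eq. 7)."""
    if not 0.0 <= mu_a <= 1.0:
        raise ValueError("mu_a must lie in [0, 1]")
    if not 0.1 <= c1 <= 0.4 or not 0.1 <= c2 <= 0.3:
        raise ValueError("c1 or c2 is outside the range allowed by Eq. (7)")
    if mu_a == 0.0:
        return 0.0
    if 0.1 <= mu_a <= 0.4:
        return mu_a + c1
    if 0.4 < mu_a < 0.7:
        return mu_a + c2
    return 1.0  # "otherwise" branch, as printed


# Hypothetical sensitivity settings chosen by a user.
levels = [obfuscation_level(m) for m in (0.0, 0.3, 0.5, 0.9)]
```

Because \(c_1 \le 0.4\) and \(c_2 \le 0.3\), the two middle branches can never push the obfuscation level above 1.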

Table 6 An example for sensitivity level of data and corresponding level of data obfuscation
Fig. 10 Snapshot of the activity recognition dataset

Fig. 11 Snapshot of the activity tracker dataset

Experiments and results

The Noise-Based Privacy-Preserving Model has been presented comprehensively in “Noise based privacy preserving model”. This section presents the experimental setup, the findings of the experiments, the performance evaluation, and the security and privacy analysis to show how privacy can be protected through the proposed model.

Fig. 12 Comparative execution time of the proposed model with fuzzifier and without fuzzifier

Fig. 13 Comparative analyses of the average execution time of the proposed model without fuzzifier and with fuzzifier

Experimental configurations

The proposed multilevel noise function mechanism, data classification mechanism, and noise removal and fuzzification mechanism are implemented in Java using NetBeans IDE 8.2 [45]. SQLite version 3.21.0 [46] is used as the backend, and SQLiteStudio 3.1.1 [47] is used to manage the SQLite database. The proposed mechanisms are executed on two different datasets. The first is the Activity Recognition from a Single Chest-Mounted Accelerometer dataset [48], collected from a wearable accelerometer mounted on the chest. Accelerometer data are collected from 15 participants performing 7 activities. The sampling frequency of the accelerometer was 52 Hz. Each record in a file contains a sequential number, x acceleration (attribute \(F_1\)), y acceleration (attribute \(F_2\)), z acceleration (attribute \(F_3\)), and an activity label. The second dataset is collected from an activity tracker, a hand-wearable device that contains a three-axis accelerometer, a detached PPG cardio tachometer, and an infrared wear sensor. This activity tracker can continuously track heart rate, steps, distance, and calories burned. It is assumed that the data collected from the chest-mounted wearable accelerometer and the activity tracker are sensitive for the user. Different results for various cases are recorded for findings and performance analysis. The initial simulation input parameters for the model are the IoT parameter (D), Data Identifier (\(D_{ID}\)), Timestamp (T), Attribute Type (\(F_i\)), and the total number of operators in a row of the operator table (N).

Results and discussion

The execution time is the time to access all the contents of a sample dataset. As shown in Eq. 8, the average execution time is the average of the total time to access the \(N_c\) number of contents of a specific attribute \(F_i\). \((t_j)_{F_i}\) is the execution time to access \(j\text {th}\) content of \(F_i\) attribute type.

$$\begin{aligned} \text {(Average execution time)}_{F_i} = \sum _{j=1}^{N_c} \frac{(t_j)_{F_i}}{N_c}. \end{aligned}$$
(8)
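
Eq. 8 is the arithmetic mean of the per-content access times. A minimal Python sketch of how such a measurement could be taken is given below; the helper names and the stand-in access function are hypothetical, and the experiments reported here were run in Java, not with this code.

```python
import time


def mean_time(timings):
    """Average execution time over N_c measured accesses (Eq. 8)."""
    if not timings:
        raise ValueError("at least one timing is required")
    return sum(timings) / len(timings)


def time_accesses(access, contents):
    """Measure the wall-clock time of access(item) for every item."""
    timings = []
    for item in contents:
        start = time.perf_counter()
        access(item)
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in for a noise removal call on one stored content of attribute F_i.
timings = time_accesses(lambda value: value - 5, range(1000))
average = mean_time(timings)
```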

A sample was taken from the activity tracker dataset, and its execution time was calculated. Figure 12 shows the comparative execution time of noise removal without and with the fuzzification mechanism in the proposed model. It can be observed from the figure that noise removal with the fuzzification mechanism requires more execution time than noise removal without fuzzification. Samples of 1000–5000 records (data points) were taken from the Single Chest-Mounted Accelerometer dataset, and the average execution time was calculated. Snapshots of the two datasets are shown in Figs. 10 and 11. Figure 13 presents the comparative average execution time of noise removal without and with the fuzzification mechanism in the proposed model for each data attribute \(F_1\), \(F_2\), and \(F_3\). A sample of the data before and after the noise treatment is shown in Table 7, and a sample of the data without and with fuzzification after the noise removal is shown in Table 8. As shown in Table 8, all the data of a specific attribute type are treated with a fixed amount of noise, which gives a fixed difference for all data of that attribute type. However, it is not necessary to treat the data with a fixed amount of noise. Each datum of a particular attribute type may be treated with a different random noise, and the resulting varying difference may enhance privacy.

Table 7 A sample of the data before and after the noise treatment
Table 8 A sample of the data without fuzzification and with the fuzzification after the noise removal

Figure 14 presents the findings of the comparative average execution time of the noise removal without fuzzification in the proposed model, and data access control scheme [25] for each data attribute \(F_1\), \(F_2\), and \(F_3\) of the Single Chest-Mounted Accelerometer dataset. Figure 15 presents the findings of the comparative average execution time of the noise removal with fuzzification in the proposed model and data access control scheme [25] for each data attribute \(F_1\), \(F_2\), and \(F_3\) of the Single Chest-Mounted Accelerometer dataset. It is clear from both of these figures that our proposed noise removal mechanism requires less execution time than the data access control scheme [25].

Figure 16 presents the comparative execution time of noise removal without fuzzification in the proposed model, the data access control scheme [25], and the data access time of the DPP model [26]. Figure 17 presents the comparative execution time of noise removal with fuzzification in the proposed model, the data access control scheme [25], and the data access time of the DPP model [26].

Algorithmic behavior and performance evaluation

IoT components, such as sensors and actuators, have limited computing capabilities and are not suitable for performing complex computing operations. The comparative analysis in Table 9 indicates that the proposed model without the fuzzifier has around 52–77% and 46–70% less computational overhead than the data access control scheme and the DPP model, respectively. The proposed model with the fuzzifier has around 48–73% and 31–63% less computational overhead than the data access control scheme and the DPP model, respectively. Since the critical research problem, and the main objective of the proposed NBPPM model, is to develop a systematic model that ensures end-to-end privacy against various threats for resource-constrained IoT environments, this analysis of computational overhead demonstrates the efficiency of the NBPPM model.

$$\begin{aligned} \Gamma (\phi ; \varepsilon , \Omega , \pi , \omega ) = \frac{1}{1+ e^{-(-\varepsilon (\phi -\pi \Omega ))}} \times \frac{1}{\omega }. \end{aligned}$$
(9)
Fig. 14 Comparative analyses of the average execution time of the proposed model without fuzzifier and data access control scheme [25]

Fig. 15 Comparative analyses of the average execution time of the proposed model with fuzzifier and data access control scheme [25]

Fig. 16 Comparative analyses of the execution time of the proposed model without fuzzifier, data access control scheme [25] and DPP model [26]

Fig. 17 Comparative analyses of the execution time of the proposed model with fuzzifier, data access control scheme [25] and DPP model [26]

Table 9 Comparative analysis of the computational overhead

As an IoT environment deals with a massive amount of data, it is crucial to consider computational time for performance measurement. The integrated multi-dimensional relationship of sensitivity levels of personally identifiable information items, privacy, and trust (Eq. (4)) allowed the authors to devise Eq. (9). It is believed that the efficiency of a privacy-preserving algorithm increases as the computational time (\(\omega \)) of that algorithm decreases, and that the trust value increases with the effectiveness of the privacy-preserving algorithm. In terms of computational time, the efficacy of the Noise Removal and Fuzzification Mechanism can be seen in the comparative analysis with the existing mechanisms. In particular, the findings show that the computational time of our noise removal mechanism is lower than that of the encryption-based mechanisms. A privacy customization feature has been incorporated for the user, and the comparative analysis with this feature also shows better performance. The experimental results presented in “Results and discussion” validate the feasibility and applicability of the novel NBPPM model for privacy preservation in the real-world, resource-constrained environment of the Internet of Things.
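
To show how the computational time enters Eq. (9), the short sketch below evaluates the expression exactly as printed, including the double negation in the exponent. The parameter names are placeholders for the symbols of Eq. (9), whose meanings follow [24] and are not restated here; the numeric values are hypothetical.

```python
import math


def trust(phi, eps, big_omega, pi_, omega=1.0):
    """Evaluate Eq. (9) as printed; omega = 1 reduces it to Eq. (4)."""
    if omega <= 0:
        raise ValueError("computational time omega must be positive")
    exponent = -(-eps * (phi - pi_ * big_omega))
    return (1.0 / (1.0 + math.exp(exponent))) / omega


# Hypothetical parameters chosen so that phi equals pi_ * big_omega.
base = trust(1.0, 2.0, 2.0, 0.5)               # logistic midpoint
slower = trust(1.0, 2.0, 2.0, 0.5, omega=2.0)  # same inputs, twice the time
```

With all other parameters fixed, doubling \(\omega \) halves the value, which matches the stated intuition that lower computational time goes with higher trust.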

Table 10 An instance of data before and after privacy preservation in NBPPM model

Security and privacy analysis

The proposed NBPPM model ensures security and privacy through Multilevel Noise Treatment and Fuzzification. The privacy of the data is ensured by adding noise. The noise is sub-divided into three sub-keys, as described in “Multilevel noise treatment”. The sub-noises \(P_1\), \(P_2\) and \(P_3\) are privately shared with the data source, the middleware and the data storage server, respectively. The proposed Multilevel Noise Treatment Mechanism stores a sensed IoT parameter D as noisy data addends. At the data source, every sensed parameter is converted into noisy data and then split into meaningless noisy data addends, so it is difficult to recover the original data without the sub-key \(P_1\) and the operator used to treat the source data with the noise \(P_1\). Further, in the proposed model, a user-customized data classifier is employed to protect sensitive data with a higher level of privacy preservation. At the middleware, it becomes more complex for an eavesdropper to recover the original sensed parameter, because both sub-noises \(P_1\) and \(P_2\) and the operators used to treat the source data with them are required. At long-term data storage (such as the cloud), recovery is extremely complex, because all three sub-noises \(P_1\), \(P_2\) and \(P_3\) and the operators used to treat the source data with them are required. The status of an instance of data at the different levels of the NBPPM model, i.e., before and after privacy preservation, is shown in Table 10.
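The layered dependence on \(P_1\), \(P_2\) and \(P_3\) can be illustrated with a small sketch. It is not the mechanism of “Multilevel noise treatment” itself: the operator at every level is assumed here to be integer addition, the split into addends is assumed to be a random additive split, and all function names and numeric values are hypothetical.

```python
import secrets


def split_into_addends(value: int, parts: int = 3) -> list[int]:
    """Split value into `parts` random integers whose sum equals value."""
    rand = [secrets.randbelow(2**16) - 2**15 for _ in range(parts - 1)]
    return rand + [value - sum(rand)]


def apply_level(addends: list[int], sub_noise: int) -> list[int]:
    """Treat the data with one more sub-noise (assumed operator: addition)
    and re-split the result into fresh addends."""
    return split_into_addends(sum(addends) + sub_noise, len(addends))


def recover(addends: list[int], sub_noises: list[int]) -> int:
    """Undo the assumed additive treatment; needs every sub-noise applied so far."""
    return sum(addends) - sum(sub_noises)


D = 72                          # sensed parameter (illustrative value)
P1, P2, P3 = 1301, 877, 4242    # sub-noises (illustrative values)

at_source = split_into_addends(D + P1)       # data source holds P1
at_middleware = apply_level(at_source, P2)   # middleware holds P2
at_storage = apply_level(at_middleware, P3)  # storage server holds P3

print(at_storage)                            # addends carry no obvious meaning
print(recover(at_storage, [P1, P2, P3]))     # original value D
```

Under these assumptions, a party that holds the stored addends but lacks any one of the three sub-noises obtains a value offset from D by the missing noise, which mirrors the increasing difficulty from data source to middleware to long-term storage described above.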

Table 11 Comparative analysis of different frameworks for privacy preservation in IoT

Furthermore, an attacker could exploit vulnerabilities such as a weak credential mechanism to gain access to the data. When a user sends a data request for an IoT parameter (D), containing the data field identifier (\(F_i\)), a timestamp, and a unique username, the authentication mechanism in our proposed model authenticates the user, so that a non-legitimate user cannot access the sensitive data. An access control list (ACL) is maintained for usernames and their credentials. After successful authentication, only a legitimate user is able to access the noisy data addends. Even if an eavesdropper succeeds in accessing a noisy data addend at this level, privacy is still preserved, since noisy data addends are meaningless on their own. Our proposed Noise Removal and Fuzzification Mechanism also provides flexible and dynamic ways to preserve privacy through the privacy manager module. Users can customize their sensitive attributes and the sensitivity level of their data. Based on this privacy customization, the fuzzifier module creates a user-specific privacy preservation environment. A comparative analysis of different frameworks for privacy preservation in IoT is presented in Table 11.
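The request-and-authentication step can be sketched as follows. The request fields (data field identifier, timestamp, username) come from the description above; everything else is an assumption made for illustration, including the class and function names, the in-memory store, and the choice of salted PBKDF2 hashes for the credentials held in the ACL.

```python
import hashlib
import hmac
import os
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRequest:
    field_id: str      # data field identifier F_i
    timestamp: float
    username: str


def hash_credential(password: str, salt: bytes) -> bytes:
    # Assumed credential scheme: salted PBKDF2-HMAC-SHA256
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class AccessControlList:
    """Hypothetical ACL mapping usernames to salted credential hashes."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[bytes, bytes]] = {}

    def register(self, username: str, password: str) -> None:
        salt = os.urandom(16)
        self._entries[username] = (salt, hash_credential(password, salt))

    def authenticate(self, username: str, password: str) -> bool:
        entry = self._entries.get(username)
        if entry is None:
            return False
        salt, stored = entry
        # Constant-time comparison of the stored and presented hashes
        return hmac.compare_digest(stored, hash_credential(password, salt))


def handle_request(acl, store, request: DataRequest, password: str):
    """Return the noisy data addends for the requested field, or None
    when the requester is not a legitimate user."""
    if not acl.authenticate(request.username, password):
        return None
    return store.get(request.field_id)


acl = AccessControlList()
acl.register("alice", "correct horse")
store = {"heart_rate": [1520, -310, 5210]}   # noisy addends, illustrative values

ok = handle_request(acl, store, DataRequest("heart_rate", time.time(), "alice"), "correct horse")
denied = handle_request(acl, store, DataRequest("heart_rate", time.time(), "mallory"), "guess")
print(ok, denied)
```

Note that even a successful request in this sketch returns only noisy data addends, so the noise removal step remains a separate barrier after authentication.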

Applicability in real life applications

The proposed NBPPM model can be used in all real-life IoT applications, especially in application domains where data sensitivity is high. This subsection illustrates the use of the NBPPM model in an IoT-based healthcare system. A typical IoT-based healthcare system involves patient(s), doctor(s), hospital(s), and IoT-based service(s). In this IoT ecosystem, a patient is the user of the IoT-based healthcare system. A patient can be equipped with sensors that sense the patient’s health parameters, and a mobile app enables the patient to use IoT-based healthcare services. Doctors and hospitals act as service providers. A hospital may use third-party services, such as cloud services, to store the massive amount of IoT data produced. In this scenario, the patient’s data are sensitive because health-related parameters are being sensed. The sensitivity of these health-related parameters may vary from patient to patient: some patients consider their sensed health parameters highly sensitive and want to keep them private, whereas others consider them less sensitive. In this situation, the proposed NBPPM model may play a significant role in preserving privacy. An NBPPM model-based IoT healthcare system preserves user privacy at the different levels of the IoT ecosystem, as described in “Security and privacy analysis”.

Limitations and future scope

Several different modifications, experiments, and analyses have been left for the future due to the study’s broad research scope. Future work may focus on in-depth analysis of the particular mechanisms with new proposals to try different enhanced strategies. The following subsection emphasizes the potential future scope for improvement and research directions.

Applicability and scope of the proposed solution with emerging domains

Evolutionary computation, e.g., a Genetic Algorithm (GA) based obfuscation mechanism, could be applied in the proposed models. The crossover and mutation phases can play a significant role in suppressing sensitive information in the IoT ecosystem, although managing the mutation phase so that information can be regenerated may be a challenging step. Still, it will be interesting to develop and analyze the behavior of these kinds of optimization techniques.

Machine Learning has the potential for real-time automation, intelligent processing, and analysis of high volumes of data. The data classification mechanism of the proposed Noise-Based Privacy-Preserving Model can exploit this predictive power, which may assist in identifying sensitive information in the IoT ecosystem and reduce human intervention. In the future, Machine Learning-based mechanisms may be incorporated in the proposed model and their behavior analyzed.

Accountability is an important feature that can enhance every privacy preservation mechanism by giving users control over their sensitive personal information. A procedure that keeps the history of all logs (such as a chain of all paths along which sensitive data have traveled and the details of each data-accessing entity) can be incorporated into the proposed models, but it will increase computational and space overhead. A future study can be conducted to incorporate this aspect.
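As one possible shape for such a procedure, the sketch below keeps an append-only access log in which every entry is linked to the previous one by a hash. This is purely an illustrative assumption and not part of the NBPPM model: the hash linking, the entry fields and the function names are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def _digest(body: dict) -> str:
    # Hash a canonical JSON encoding of the entry body
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log: list, entity: str, field_id: str, location: str) -> None:
    """Record which entity accessed which data field at which hop."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"entity": entity, "field_id": field_id, "location": location, "prev": prev}
    log.append({**body, "hash": _digest(body)})


def verify(log: list) -> bool:
    """Check that no entry has been altered, removed, or reordered."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev"] != prev or _digest(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


log: list = []
append_entry(log, "middleware-1", "heart_rate", "middleware")
append_entry(log, "dr_smith", "heart_rate", "cloud-storage")
print(verify(log))
```

Each appended entry costs one hash computation and one stored record, which gives a rough sense of the computational and space overhead mentioned above.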

There is a tradeoff between the Quality of Service (QoS) provided to the user and the amount of user data consumed by the service provider. In the Noise-Based Privacy-Preserving Model, a sub-module (Privacy Manager) plays a crucial role in customizing user privacy, and privacy is ensured through the fuzzification process when data are transferred between the user and the service provider. There is scope to further optimize the membership function for a specific application and case study.

Along with the study’s future scope, the following are emerging domains where proposed solutions can be employed:

Edge computing and fog computing

Edge computing and fog computing both move computational capabilities closer to the data source, and these paradigms may bring data intelligence and data analytics near the data sources of the IoT ecosystem. In future work, such approaches may be adopted in our proposed privacy-preserving model to distribute trust and enhance privacy protection in the IoT ecosystem.

It will be interesting to develop and study the architectural integration of edge and fog computing in the privacy-preserving IoT ecosystem, privacy-preserving edge and fog data processing, and management of edge and fog nodes in the frameworks.

Blockchain

An adversary can infer significant information about users from blockchain-based IoT networks. These systems need specific privacy-protection plans to preserve personal and device privacy. A critical factor that causes privacy leakage in a blockchain network is address reuse. The public addresses of blockchain users are open to anyone in the network, and an adversary can easily obtain these addresses over the internet. A perfectly anonymous transaction in the blockchain is unlikely without a dedicated privacy-preservation plan. Also, linking attacks can be performed over the distributed ledger, which contains a copy of the transactions [49]. The proposed model can be applied to preserve privacy in this scenario, and studying these use cases is part of the future scope of this work.

Fifth Generation technology

The recently emerging Fifth Generation (5G) technology is expected to transform every area of life by connecting everything, everywhere, through IoT devices. However, massively interconnected devices and high-speed data communication will bring challenges of privacy and energy insufficiency. 5G industries and organizations require privacy preservation to remain viable and competitive. Moreover, the billions of devices expected to communicate over the 5G network will consume a considerable amount of energy despite having constrained energy resources. Hence, energy optimization is a future challenge for 5G industries that needs to be addressed [50]. In this case, our proposed privacy-preserving model can be integrated with 5G technology, and it will be interesting to study improved privacy together with energy resource optimization in this specific use case.

Autonomous vehicles

Emerging complex cyber-physical systems (CPS) such as autonomous vehicles are equipped with different sensors and intelligent logic to provide advanced auxiliary services. Owing to their sensors and onboard intelligence, such vehicles gather, analyze, and capitalize upon an unprecedented quantity of fine-grained data and cooperate in real time with various stakeholders. However, such valuable data can significantly impact data-driven economies of scale, which raises questions concerning privacy and integrity-dependent situations [51]. Future work on the proposed model includes measuring its real-time performance with autonomous vehicles and studying the balance between privacy preservation and quality of service in this specific use case.

Conclusion

The NBPPM model has been presented to address critical issues of privacy preservation in the IoT ecosystem. The proposed model ensures end-to-end privacy preservation in the IoT environment. The NBPPM is a robust and flexible model that ensures privacy preservation according to the user’s preferences. The performance of the proposed NBPPM model has been evaluated in terms of computational overhead. Our experimental results show that the computational cost of NBPPM is reasonably low in practical scenarios. In this article, the feasibility of the proposed model has been demonstrated for the resource-constrained environment of the IoT. A promising direction for future work on the NBPPM model is to incorporate accountability procedures at the appropriate levels to enhance control over personal information in the IoT environment. The outcomes of this work may have a significant effect on IoT-based industries.