Abstract

In recent years, it has become popular to upload patients’ medical data to a third-party cloud server (TCS) for storage through the medical Internet of things. Doing so reduces the local maintenance burden of the medical data and, importantly, can improve the accuracy of medical treatment. As a remote TCS cannot be fully trusted, medical data should be encrypted before uploading to protect patients’ privacy. However, encryption makes it difficult for patients and doctors to search the data. To address this issue, Huang et al. recently put forward the notion of Public-key Authenticated Encryption with Keyword Search (PAEKS) against inside keyword guessing attacks. However, the existing PAEKS schemes rely on time-consuming pairing computations. Moreover, some PAEKS schemes still have security issues in a multiuser setting. In this paper, we propose a new and efficient PAEKS scheme, which uses the idea of Diffie-Hellman key agreement to generate a shared secret key between each sender and receiver. The shared key is used by the sender to encrypt keywords and by the receiver to generate search trapdoors. We prove that our scheme is semantically secure against inside keyword guessing attacks in a multiuser setting, under the oracle Diffie-Hellman assumption. Experimental results demonstrate that our PAEKS scheme is more efficient than previous ones, especially in terms of keyword searching time.

1. Introduction

In today’s society, almost all medical service providers use some form of electronic medical record system [1]. Specifically, the medical Internet of things (MIoT) has become a new technology for gathering data from patients through small wearable devices or implantable sensors. With the growing volume of medical data, the burden on hospital storage equipment is heavy, and professional staff are needed to maintain it. If a hardware storage device is damaged or data is lost due to force majeure, the consequences can be very serious. The most important way to solve this problem is to upload the data to a third-party cloud server (TCS). However, after the data is uploaded to the TCS, the patient’s privacy is no longer guaranteed. Once cloud server managers or external malicious attackers steal the data, data leakage and other problems follow [2].

In order to solve the problem of data security, the best way is to encrypt the data and then upload the result to TCS. But when medical service providers want to retrieve the electronic medical records of patients, it becomes more difficult. First, doctors need to download all encrypted data to a local server and then decrypt it locally. After that, they can search for the desired results in the plaintext medical data. However, this process is very cumbersome and impractical for most applications. Due to the powerful cloud computing, medical institutions hope that the cloud server can complete the retrieval function instead of doing it themselves. But if the key is sent to the cloud server, the patient’s private data still has the risk of exposure.

To address the above security issues, the concept of (symmetric-key) searchable encryption (SE) was proposed by Song et al. [3]. It is a powerful technology that allows the cloud server to search on encrypted data using search trapdoors generated by the local data users. In 2004, Boneh et al. [4] proposed a public-key version of SE, namely, Public-key Encryption with Keyword Search (PEKS). This scheme embeds keywords in public-key encryption and is very suitable for multiuser data sharing scenarios, e.g., medical data sharing. There are three parties in a PEKS scheme: cloud server, data sender, and data receiver. The sender (e.g., patient) has many private files and wants to share them with the receiver (e.g., doctor). First, the sender extracts a keyword from each file, encrypts the keyword with the PEKS scheme, and then encrypts the file itself with another encryption scheme (not necessarily the same as the PEKS scheme). The sender uploads all keyword cipher texts and encrypted files to the TCS. To search for documents containing a given keyword among the encrypted documents, the receiver generates a search trapdoor for the keyword and sends the trapdoor to the cloud server. After the server receives the trapdoor, it checks whether each keyword cipher text matches the search trapdoor. If so, the corresponding encrypted document must contain the desired keyword. After that, the results are returned to the receiver, and the receiver obtains the required plaintext data by decrypting the encrypted documents.

As shown in Figure 1, we apply searchable encryption to telemedicine services, where the patient is the sender and the medical service provider is the receiver. Each patient can encrypt and upload their own electronic medical record to the cloud server. When the patient wants to see a doctor remotely, the doctor can retrieve the medical data related to some disease from the third-party cloud server according to the patient’s keyword information. In this process, doctors only obtain data related to a certain disease, and other information about the patient (such as the name) is not exposed.

However, PEKS is inherently vulnerable to keyword guessing attacks (KGA). Ideally, a keyword space can be considered infinite. In practice, however, this is not the case. In real life, users often use a limited number of keywords because of their habits, which turns the original polynomial-size space into a fixed and low-entropy space. In this case, the adversary can guess the keyword contained in a search trapdoor as follows: First, the adversary enumerates all candidate keywords in the user’s keyword space and generates a keyword cipher text for each of them. The adversary then tests the trapdoor requested by the user against these self-generated keyword cipher texts one by one. If one of them matches, the adversary learns the keyword searched by the user, thus exposing the user’s privacy. This kind of attack can be easily mounted by the cloud server, as the cloud server holds users’ search trapdoors. Such an attack is often called an inside keyword guessing attack (IKGA).
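The guessing procedure described above can be sketched as a simple loop. The snippet below is only an illustration: `toy_encrypt`, `toy_trapdoor`, and `toy_test` are hypothetical stand-ins built from SHA-256 (they are not a real PEKS scheme and involve no key pair); they only mimic the property that anyone can encrypt a keyword and anyone can run the test.

```python
import hashlib
import secrets

def toy_trapdoor(w: str) -> bytes:
    """Hypothetical stand-in for a search trapdoor (a real one needs the receiver's secret key)."""
    return hashlib.sha256(w.encode()).digest()

def toy_encrypt(w: str) -> tuple[bytes, bytes]:
    """Hypothetical stand-in for public keyword encryption: anyone can run it."""
    r = secrets.token_bytes(16)
    return r, hashlib.sha256(r + toy_trapdoor(w)).digest()

def toy_test(ct: tuple[bytes, bytes], td: bytes) -> bool:
    """Return True if the cipher text and the trapdoor contain the same keyword."""
    r, c = ct
    return hashlib.sha256(r + td).digest() == c

def guess_keyword(td: bytes, candidates: list[str]):
    """Keyword guessing: encrypt every candidate and test it against the trapdoor."""
    for w in candidates:
        if toy_test(toy_encrypt(w), td):
            return w
    return None

# The server only sees the trapdoor, yet recovers the keyword from a small dictionary.
observed = toy_trapdoor("diabetes")
recovered = guess_keyword(observed, ["asthma", "diabetes", "influenza"])
```

The loop succeeds whenever the searched keyword lies in the candidate list, which is why a low-entropy keyword space is the core of the problem.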

Resisting KGA is very challenging. Recently, many methods [5–11] were proposed to prevent KGA on PEKS schemes; however, most of them were later proven insecure [12–15]. In 2017, Huang and Li [16] proposed a new primitive, namely, Public-key Authenticated Encryption with Keyword Search (PAEKS), to solve the problem of inside KGA. In PAEKS, the data sender not only encrypts a keyword but also authenticates it, so that a search trapdoor can only match cipher texts from the corresponding data sender. PAEKS is also applicable to cloud-assisted MIoT, as in general, the doctor just searches on a designated patient’s medical data. However, the proposed concrete PAEKS scheme still has some security issues [17–19]. In particular, Noroozi and Eslami [18] pointed out that it cannot handle multiuser settings and provided an improved security model for PAEKS in a multiuser setting.

1.1. Our Contribution

In this paper, we study new and efficient constructions of PAEKS schemes in a multiuser setting for cloud-assisted MIoT. Our main contributions are as follows:
(i) We observe that in PAEKS, both the data sender and data receiver hold a pair of public/secret keys. If they can compute a shared key without any interaction, then the shared key can be viewed as a secret key of a symmetric searchable encryption scheme. Inspired by this, we propose an efficient PAEKS scheme, which uses the (noninteractive) Diffie-Hellman key exchange scheme to compute the shared key and Song et al.’s SSE scheme to encrypt keywords. It removes the time-consuming pairing operations used in previous PAEKS schemes.
(ii) We show that our scheme is semantically secure against IKGA in a multiuser setting under the oracle Diffie-Hellman assumption [20]. Specifically, it satisfies both cipher text indistinguishability and trapdoor indistinguishability.
(iii) We compare our scheme with some related PAEKS schemes in terms of security and computation efficiency and also run experiments to demonstrate the efficiency of our scheme for protecting the privacy of cloud-assisted MIoT data. Experimental results show that our scheme is more efficient than previous ones, especially in terms of keyword searching time.

1.2. Paper Organization

In the next section, we will briefly introduce some cryptographic primitives. Our main construction of the PAEKS scheme and its security proof are given in Section 3. In Section 4, we compare the efficiency of our scheme with that of other related PAEKS schemes. Finally, we summarize the paper in Section 5.

2. Preliminaries

In this section, we recall some basic concepts and cryptographic primitives that will be used in this paper, including cyclic groups, the hardness assumption, pseudorandom functions, the syntax of PAEKS, and its security model.

2.1. Cyclic Group

Let $G$ be a group with order $q$. We say that $G$ is a cyclic group if the group can be generated by a single element $g \in G$. That is, every element $h \in G$ has the form $h = g^x$ for some exponent $x \in \mathbb{Z}_q$. We call $g$ a generator of the group. In our scheme, we use a cyclic group with a prime order; i.e., $q$ is a prime. In this case, any group element except the identity will be a generator.
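As a small sanity check of the last statement, the following sketch enumerates a toy prime-order group. The parameters are assumptions chosen only for illustration (the subgroup of quadratic residues modulo the safe prime 23, which has prime order 11); they are far too small for any real use and are not taken from the paper.

```python
# Toy prime-order subgroup of Z_P^* (illustrative sizes only).
P = 23  # safe prime: P = 2*Q + 1
Q = 11  # prime order of the subgroup of quadratic residues

def generated_by(g: int) -> set[int]:
    """Return the set {g^x mod P : x in Z_Q} generated by g."""
    return {pow(g, x, P) for x in range(Q)}

# The quadratic residues modulo P form the subgroup of order Q.
group = {pow(a, 2, P) for a in range(1, P)}

# Collect every non-identity element that generates the whole group.
generators = [h for h in sorted(group) if h != 1 and generated_by(h) == group]
```

Because the order is prime, the order of any non-identity element must equal the group order, so all ten non-identity elements appear in `generators`.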

2.2. Oracle Diffie-Hellman (ODH) Problem [20]

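Let $G$ be a cyclic group with prime order $q$ and a generator $g$. Let $H$ be a hash function from $G$ to some $\ell$-bit length space $\{0,1\}^{\ell}$. The ODH problem states that given a tuple $(g, g^u, g^v, W)$ and an oracle $\mathcal{H}_v(\cdot)$, to decide whether $W$ is $H(g^{uv})$ or a random string from $\{0,1\}^{\ell}$; here, $u$ and $v$ are randomly chosen from $\mathbb{Z}_q$, and the oracle returns $\mathcal{H}_v(X) = H(X^v)$ for each $X \neq g^u$. Let $\mathcal{A}$ be any probabilistic polynomial time (PPT) algorithm. We say that $\mathcal{A}$ breaks the ODH problem over group $G$ and $H$ with advantage at most $\epsilon$, if

$$\mathrm{Adv}^{\mathrm{ODH}}_{\mathcal{A}} = \left| \Pr\left[\mathcal{A}^{\mathcal{H}_v(\cdot)}(g, g^u, g^v, H(g^{uv})) = 1\right] - \Pr\left[\mathcal{A}^{\mathcal{H}_v(\cdot)}(g, g^u, g^v, W) = 1\right] \right| \le \epsilon, \tag{1}$$

where $W$ in the second probability is a random string from $\{0,1\}^{\ell}$.

Definition 1 (ODH assumption). We say that the ODH assumption holds over group $G$ and $H$, if for any PPT algorithm $\mathcal{A}$, its advantage in solving the ODH problem is negligible in $\lambda$ (the bit length of $q$).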
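To make the shape of an ODH instance concrete, here is a minimal sketch of how a challenger could sample one. Everything specific is an assumption for illustration: the toy group (quadratic residues modulo 23 with generator 4), SHA-256 standing in for the hash function, and the helper name `odh_instance`. Exponents are drawn from the nonzero residues to keep the toy example simple.

```python
import hashlib
import secrets

P, Q, G = 23, 11, 4  # toy group of prime order Q generated by G (illustration only)

def H(elem: int) -> bytes:
    """Stand-in hash function from group elements to 256-bit strings."""
    return hashlib.sha256(elem.to_bytes(4, "big")).digest()

def odh_instance(real: bool):
    """Sample (g^u, g^v, W) plus the restricted oracle; W is H(g^{uv}) if real, else random."""
    u = secrets.randbelow(Q - 1) + 1
    v = secrets.randbelow(Q - 1) + 1
    U, V = pow(G, u, P), pow(G, v, P)
    W = H(pow(G, u * v, P)) if real else secrets.token_bytes(32)

    def oracle(X: int) -> bytes:
        # The oracle answers H(X^v) for every X except the challenge element g^u.
        if X == U:
            raise ValueError("query on g^u is not allowed")
        return H(pow(X, v, P))

    # The exponents are returned only so that the sketch can be checked.
    return U, V, W, oracle, (u, v)
```

The restriction on the oracle is what keeps the problem nontrivial: querying it on $g^u$ would directly reveal $H(g^{uv})$.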

2.3. Pseudorandom Functions (PRFs)

A pseudorandom function is a family of functions such that for a random choice from the family, its input/output behavior is computationally indistinguishable from that of a random function. A formal definition of PRFs is given below.

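Definition 2 (PRFs). Let $F: \mathcal{K} \times \mathcal{X} \to \mathcal{Y}$ be a family of functions indexed with key space $\mathcal{K}$ from $\mathcal{X}$ to $\mathcal{Y}$. We say that $F$ is an $\epsilon$-PRF if
(1) Given a key $k \in \mathcal{K}$ and an input $x \in \mathcal{X}$, there is an efficient algorithm to compute the output $F(k, x)$.
(2) For any PPT algorithm $\mathcal{A}$ that makes at most a polynomial number of oracle queries, the following advantage is at most $\epsilon$:

$$\mathrm{Adv}^{\mathrm{PRF}}_{F,\mathcal{A}} = \left| \Pr\left[\mathcal{A}^{F(k,\cdot)} = 1\right] - \Pr\left[\mathcal{A}^{RF(\cdot)} = 1\right] \right|, \tag{2}$$

where $k$ is chosen uniformly from $\mathcal{K}$, $RF$ is a random function from $\mathcal{X}$ to $\mathcal{Y}$, and the oracles $F(k,\cdot)$ and $RF(\cdot)$ are given an input $x$ and output the corresponding image of the function.

The above definition indicates that, given any polynomial number of valid input/output pairs $(x_i, F(k, x_i))$, no PPT adversary can predict $F(k, x)$ for a new and distinct input $x$. Specifically, $F(k, x)$ is computationally indistinguishable from a random $y \in \mathcal{Y}$.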
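In practice, a keyed hash is a common way to instantiate such a function. The sketch below uses HMAC-SHA256 from the Python standard library as a stand-in PRF; this choice is an assumption for illustration and is not stated in the paper, and the helper name `prf` is hypothetical.

```python
import hashlib
import hmac
import secrets

def prf(key: bytes, x: bytes) -> bytes:
    """Stand-in PRF F(k, x), instantiated here with HMAC-SHA256 (an assumption)."""
    return hmac.new(key, x, hashlib.sha256).digest()

key = secrets.token_bytes(32)        # random key from the key space
y1 = prf(key, b"diabetes")           # deterministic for a fixed key and input
y2 = prf(key, b"diabetes")
y3 = prf(key, b"hypertension")       # a different input gives an unrelated-looking output
```

Determinism under a fixed key is what later lets the sender and the receiver derive the same value from the same keyword.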

2.4. PAEKS and Security Model

The notion of Public-key Authenticated Encryption with Keyword Search (PAEKS) was first proposed in [16] to protect the privacy of a keyword against inside keyword guessing attacks. It involves the sender’s public/secret key pair in the generation of the cipher text, which prevents keyword guessing attacks by the inside server. We first recall its definition.

Definition 3 (syntax of PAEKS). A PAEKS scheme consists of the following six PPT algorithms:
(i) $\mathrm{Setup}(\lambda)$. This is the global parameter generation algorithm. It takes the security parameter $\lambda$ as input and outputs the global system parameter $gp$.
(ii) $\mathrm{KeyGen}_S(gp)$. This is the sender’s key generation algorithm. It takes the global system parameter $gp$ as input and outputs a public/secret key pair $(pk_S, sk_S)$.
(iii) $\mathrm{KeyGen}_R(gp)$. This is the receiver’s key generation algorithm. It takes the global system parameter $gp$ as input and outputs a public/secret key pair $(pk_R, sk_R)$.
(iv) $\mathrm{PAEKS}(sk_S, pk_R, w)$. This is the keyword encryption algorithm performed by the sender. It takes the sender’s secret key $sk_S$, the receiver’s public key $pk_R$, and a keyword $w$ as input and outputs a PAEKS cipher text $C_w$ of the keyword.
(v) $\mathrm{Trapdoor}(sk_R, pk_S, w)$. This is the trapdoor generation algorithm performed by the receiver. It takes the receiver’s secret key $sk_R$, the sender’s public key $pk_S$, and a keyword $w$ as input and outputs a trapdoor $T_w$.
(vi) $\mathrm{Test}(T_w, C_w, pk_S, pk_R)$. This is the test algorithm performed by the cloud server. It takes a trapdoor $T_w$, a PAEKS cipher text $C_w$, the sender’s public key $pk_S$, and the receiver’s public key $pk_R$ as input and outputs 1 if $T_w$ and $C_w$ contain the same keyword and 0 otherwise.

Next, we recall the improved security model for PAEKS in a multiuser setting by Noroozi and Eslami [18]. It includes trapdoor indistinguishability (TI) and cipher text indistinguishability (CI). Both of them are described through games played between an adversary $\mathcal{A}$ and the challenger $\mathcal{C}$.

Definition 4 (TI security game). The TI security game is described as follows:
(i) Initialization. Given a security parameter $\lambda$, the challenger generates the global system parameter $gp$. Then, the challenger generates the receiver’s public/secret keys $(pk_R, sk_R)$ and the sender’s public/secret keys $(pk_S, sk_S)$. It executes the adversary $\mathcal{A}$ on input $(gp, pk_R, pk_S)$.
(ii) Phase 1. The adversary is permitted to adaptively query the following two oracles polynomially many times:
    (a) Cipher Text Oracle $\mathcal{O}_C$. Given a keyword $w$ and a public key $pk$, the challenger computes the cipher text $C = \mathrm{PAEKS}(sk_S, pk, w)$ by running the algorithm PAEKS and returns the cipher text to $\mathcal{A}$.
    (b) Trapdoor Oracle $\mathcal{O}_T$. Given a keyword $w$ and a public key $pk$, the challenger computes the trapdoor $T = \mathrm{Trapdoor}(sk_R, pk, w)$ by running the algorithm Trapdoor and returns the trapdoor to $\mathcal{A}$.
(iii) Challenge. When phase 1 ends, the adversary outputs two challenge keywords $w_0^*$ and $w_1^*$, which have not been queried to the oracles $\mathcal{O}_C$ and $\mathcal{O}_T$ before. Now, the challenger chooses a random bit $b \in \{0,1\}$, computes the challenge trapdoor $T^* = \mathrm{Trapdoor}(sk_R, pk_S, w_b^*)$, and returns it to the adversary.
(iv) Phase 2. In this phase, the adversary can continue to access the oracles, with the restriction that neither $w_0^*$ nor $w_1^*$ can be queried to the oracles $\mathcal{O}_C$ and $\mathcal{O}_T$.
(v) Guessing. Finally, the adversary outputs a bit $b'$ as the guess of $b$. If $b' = b$, we say that $\mathcal{A}$ wins the game.
We define $\mathcal{A}$’s advantage in breaking the TI security of PAEKS as

$$\mathrm{Adv}^{\mathrm{TI}}_{\mathcal{A}}(\lambda) = \left| \Pr[b' = b] - \frac{1}{2} \right|. \tag{3}$$

Definition 5 (CI security game). Similarly, the CI security game can be described as follows:
(i) Initialization. Given a security parameter $\lambda$, the challenger generates the global system parameter $gp$. Then, the challenger generates the receiver’s public/secret keys $(pk_R, sk_R)$ and the sender’s public/secret keys $(pk_S, sk_S)$. It executes the adversary $\mathcal{A}$ on input $(gp, pk_R, pk_S)$.
(ii) Phase 1. The adversary is allowed to adaptively query the following two oracles polynomially many times:
    (a) Cipher Text Oracle $\mathcal{O}_C$. Given a keyword $w$ and a public key $pk$, the challenger computes the cipher text $C = \mathrm{PAEKS}(sk_S, pk, w)$ by running the algorithm PAEKS and returns the cipher text to $\mathcal{A}$.
    (b) Trapdoor Oracle $\mathcal{O}_T$. Given a keyword $w$ and a public key $pk$, the challenger computes the trapdoor $T = \mathrm{Trapdoor}(sk_R, pk, w)$ by running the algorithm Trapdoor and returns the trapdoor to $\mathcal{A}$.
(iii) Challenge. When phase 1 ends, the adversary outputs two challenge keywords $w_0^*$ and $w_1^*$, which have not been queried to the oracles $\mathcal{O}_C$ and $\mathcal{O}_T$ before. Now, the challenger chooses a random bit $b \in \{0,1\}$, computes the challenge cipher text $C^* = \mathrm{PAEKS}(sk_S, pk_R, w_b^*)$, and returns it to the adversary.
(iv) Phase 2. In this phase, the adversary can continue to access the oracles, with the restriction that neither $w_0^*$ nor $w_1^*$ can be queried to the oracles $\mathcal{O}_C$ and $\mathcal{O}_T$.
(v) Guessing. Finally, the adversary outputs a bit $b'$ as the guess of $b$. If $b' = b$, we say that $\mathcal{A}$ wins the game.
We define $\mathcal{A}$’s advantage in breaking the CI security of PAEKS as

$$\mathrm{Adv}^{\mathrm{CI}}_{\mathcal{A}}(\lambda) = \left| \Pr[b' = b] - \frac{1}{2} \right|. \tag{4}$$

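If for any PPT adversary $\mathcal{A}$, both $\mathrm{Adv}^{\mathrm{TI}}_{\mathcal{A}}(\lambda)$ and $\mathrm{Adv}^{\mathrm{CI}}_{\mathcal{A}}(\lambda)$ are negligible in the security parameter $\lambda$, we say that the PAEKS scheme is semantically secure against inside keyword guessing attacks.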

3. Our PAEKS Scheme

In this section, we introduce a PAEKS scheme for an electronic medical record system. The system framework is given in Figure 2.

3.1. The Construction

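Our PAEKS scheme is described as follows:
(i) $\mathrm{Setup}(\lambda)$. Select a cyclic group $G$ with prime order $q$ and a random generator $g$ of $G$. Select three pseudorandom functions: $F_1: \mathcal{K}_1 \times \mathcal{W} \to \{0,1\}^n$, $F_2: \mathcal{K}_2 \times \{0,1\}^n \to \mathcal{K}_3$, and $F_3: \mathcal{K}_3 \times \{0,1\}^{n-m} \to \{0,1\}^m$, where $\mathcal{K}_1$, $\mathcal{K}_2$, and $\mathcal{K}_3$ are the key spaces of the three PRFs, respectively, and $\mathcal{W}$ is the keyword space. Let $H$ be a hash function, defined as $H: G \to \mathcal{K}_1 \times \mathcal{K}_2$. Finally, return $gp = (G, q, g, F_1, F_2, F_3, H)$.
(ii) $\mathrm{KeyGen}_S(gp)$. Randomly select $x \in \mathbb{Z}_q$, and set $pk_S = g^x$ and $sk_S = x$. Return $pk_S$ and $sk_S$.
(iii) $\mathrm{KeyGen}_R(gp)$. Randomly select $y \in \mathbb{Z}_q$, and set $pk_R = g^y$ and $sk_R = y$. Return $pk_R$ and $sk_R$.
(iv) $\mathrm{PAEKS}(sk_S, pk_R, w)$. To encrypt a keyword $w$, do the following:
    (a) Compute the keys $(k_1, k_2) = H(pk_R^{sk_S})$.
    (b) Compute $X = F_1(k_1, w)$ and $k = F_2(k_2, X)$.
    (c) Select a random string $S \in \{0,1\}^{n-m}$ and set $T = F_3(k, S)$.
    (d) Set $C = X \oplus (S \| T)$.
    (e) Finally, return $C$.
(v) $\mathrm{Trapdoor}(sk_R, pk_S, w)$. With the keys $(k_1, k_2) = H(pk_S^{sk_R})$, compute $X = F_1(k_1, w)$ and $k = F_2(k_2, X)$. Return the trapdoor $T_w = (X, k)$.
(vi) $\mathrm{Test}(T_w, C, pk_S, pk_R)$. Compute $C \oplus X$ and parse it as $S \| T$. If $T = F_3(k, S)$ holds, return 1; otherwise, return 0.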

3.1.1. Correctness

Let the receiver’s key pair be $(pk_R, sk_R) = (g^y, y)$ and the sender’s key pair be $(pk_S, sk_S) = (g^x, x)$. Then, the key $(k_1, k_2) = H(g^{xy})$ can be computed by both parties. Let $C$ be a cipher text of keyword $w$ generated by the sender and $T_w$ be the corresponding search trapdoor generated by the receiver. According to the keyword encryption algorithm, there must exist two strings $X$ and $k$ and a random string $S$ such that $X = F_1(k_1, w)$, $k = F_2(k_2, X)$, and $C = X \oplus (S \| T)$, where $T = F_3(k, S)$. For a right trapdoor $T_w$ of keyword $w$, it should be in the form $(X, k)$, where $X = F_1(k_1, w)$ and $k = F_2(k_2, X)$. So, $C \oplus X = S \| T$. Let $S$ be the first $n-m$ bits of $C \oplus X$ and $T$ be the last $m$ bits. Clearly, $T = F_3(k, S)$ will hold. Thus, for the same keyword, the cipher text will match with the corresponding trapdoor.

Fixing a cipher text $C'$ of a distinct keyword $w' \neq w$, we have $C' = X' \oplus (S' \| T')$, for some $X' = F_1(k_1, w')$. Since $F_1$ is a pseudorandom function, $X'$ is indistinguishable from a random string over $\{0,1\}^n$. In this case, $C' \oplus X$ will be a random string. Since $F_3$ is also a pseudorandom function, for a random string $S \| T = C' \oplus X$, the equation $T = F_3(k, S)$ holds with probability negligibly close to $2^{-m}$. Thus, the cipher text $C'$ matches with the search trapdoor $T_w$ only with negligible probability. So our PAEKS scheme satisfies correctness.

3.2. Security Proof

In this section, we prove that our PAEKS scheme satisfies both trapdoor indistinguishability and cipher text indistinguishability. Its trapdoor indistinguishability follows from the theorem below.

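Theorem 6. If the oracle Diffie-Hellman assumption holds and $F_1$ is a pseudorandom function, then our PAEKS scheme achieves trapdoor indistinguishability. Specifically, for any PPT adversary $\mathcal{A}$, we have

$$\mathrm{Adv}^{\mathrm{TI}}_{\mathcal{A}}(\lambda) \le \mathrm{Adv}^{\mathrm{ODH}}_{\mathcal{B}}(\lambda) + \mathrm{Adv}^{\mathrm{PRF}}_{F_1,\mathcal{B}'}(\lambda), \tag{5}$$

where $\mathrm{Adv}^{\mathrm{ODH}}_{\mathcal{B}}(\lambda)$ and $\mathrm{Adv}^{\mathrm{PRF}}_{F_1,\mathcal{B}'}(\lambda)$ are the advantages to break the ODH assumption and the pseudorandomness of the PRF $F_1$.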

Proof. Let $\mathcal{A}$ be any PPT adversary that aims to break the trapdoor indistinguishability of our PAEKS scheme. We prove Theorem 6 by a sequence of games. Let $S_i$ denote the event that $\mathcal{A}$ succeeds (i.e., $b' = b$) in the $i$-th game.
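Game 0. This is the original trapdoor indistinguishability game as defined in Definition 4. In this game, the challenger generates two public/secret key pairs $(pk_S, sk_S)$ and $(pk_R, sk_R)$ for the sender and the receiver, respectively, and gives the public keys to $\mathcal{A}$. In addition, the adversary can adaptively issue queries to the trapdoor oracle $\mathcal{O}_T$ and cipher text oracle $\mathcal{O}_C$ with any keyword $w$ and public key $pk$. But, for the two challenge keywords $w_0^*$ and $w_1^*$, the adversary cannot submit them to the oracles $\mathcal{O}_C$ and $\mathcal{O}_T$. Let $T^* = (X^*, k^*)$ denote the challenge trapdoor of $w_b^*$, where $b \in \{0,1\}$. Let $b'$ denote the guess of $b$ by $\mathcal{A}$. So, $\mathcal{A}$’s advantage in this game is

$$\mathrm{Adv}^{\mathrm{TI}}_{\mathcal{A}}(\lambda) = \left| \Pr[S_0] - \frac{1}{2} \right|. \tag{6}$$

Game 1. This game is the same as the previous game with the exception of $(k_1, k_2)$ being sampled from $\mathcal{K}_1 \times \mathcal{K}_2$ uniformly at random. Recall that in the previous game, the challenger computes $(k_1, k_2)$ by $H(pk_R^{sk_S})$ (namely, $H(pk_S^{sk_R})$) according to the keyword encryption algorithm (namely, trapdoor generation algorithm). We now prove that

$$\left| \Pr[S_1] - \Pr[S_0] \right| \le \mathrm{Adv}^{\mathrm{ODH}}_{\mathcal{B}}(\lambda). \tag{7}$$

Given an instance of the oracle Diffie-Hellman problem $(g, g^u, g^v, W)$, where $W = H(g^{uv})$ or $W$ is a random string from $\mathcal{K}_1 \times \mathcal{K}_2$, we construct an algorithm $\mathcal{B}$ to solve it using $\mathcal{A}$ as a subroutine. $\mathcal{B}$ sets $pk_S = g^u$ and $pk_R = g^v$ and gives them to $\mathcal{A}$. The corresponding secret keys are implicitly set to be $sk_S = u$ and $sk_R = v$, respectively. In addition, $\mathcal{B}$ chooses the other system parameters, including $F_1$, $F_2$, and $F_3$, by itself. Parse $W$ as $(k_1, k_2)$. When $\mathcal{A}$ issues queries to the oracles $\mathcal{O}_C$ and $\mathcal{O}_T$ with a public key $pk$ other than the challenge keys, $\mathcal{B}$ invokes its ODH oracle on $pk$ to obtain the corresponding shared key. When $\mathcal{A}$ issues queries to the oracles $\mathcal{O}_C$ and $\mathcal{O}_T$ with the challenge keys, $\mathcal{B}$ uses $(k_1, k_2)$ to generate cipher texts and trapdoors. For example, for a keyword $w$, $\mathcal{B}$ computes the cipher text as follows:
(1) Compute $X = F_1(k_1, w)$ and $k = F_2(k_2, X)$.
(2) Select a random string $S \in \{0,1\}^{n-m}$ and set $T = F_3(k, S)$.
(3) Set $C = X \oplus (S \| T)$.
Given two challenge keywords $w_0^*$ and $w_1^*$, $\mathcal{B}$ computes the challenge trapdoor as follows:
(1) Choose a random bit $b \in \{0,1\}$.
(2) Compute $X^* = F_1(k_1, w_b^*)$ and $k^* = F_2(k_2, X^*)$.
(3) Set $T^* = (X^*, k^*)$.
Finally, $\mathcal{A}$ outputs a bit $b'$ as a guess of $b$. If $b' = b$, $\mathcal{B}$ outputs 1; otherwise, $\mathcal{B}$ outputs 0.
Clearly, if $W = H(g^{uv})$, the above game is identical to Game 0. Otherwise, it is identical to Game 1. So,

$$\left| \Pr[S_1] - \Pr[S_0] \right| = \left| \Pr[\mathcal{B} = 1 \mid W \text{ random}] - \Pr[\mathcal{B} = 1 \mid W = H(g^{uv})] \right| \le \mathrm{Adv}^{\mathrm{ODH}}_{\mathcal{B}}(\lambda). \tag{8}$$

This proves the result of Equation (7).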
Game 2. This game is identical to the previous game with the exception of $X^*$ being sampled randomly from $\{0,1\}^n$. Assuming that $F_1$ is a pseudorandom function, we have

$$\left| \Pr[S_2] - \Pr[S_1] \right| \le \mathrm{Adv}^{\mathrm{PRF}}_{F_1,\mathcal{B}'}(\lambda). \tag{9}$$

We now prove Equation (9). Given a challenge pseudorandom function $F_1$, we construct an algorithm $\mathcal{B}'$ to break its pseudorandomness using $\mathcal{A}$ as a subroutine. $\mathcal{B}'$ chooses the system parameter and the sender’s and receiver’s public/secret key pairs as in the previous game, with the exception of $k_1$ being provided by its own challenger. Specifically, the random string $k_2$ is chosen by $\mathcal{B}'$ itself, but $k_1$ is implicitly defined by the secret key of the challenge pseudorandom function $F_1$. Next, we show how $\mathcal{B}'$ answers $\mathcal{A}$’s queries of cipher texts and trapdoors with $pk_R$ or $pk_S$, respectively. For a keyword $w$, $\mathcal{B}'$ computes its cipher text as follows:
(1) Query the challenger of $F_1$ with $w$ to obtain the result $X$.
(2) Compute $k = F_2(k_2, X)$.
(3) Select a random string $S \in \{0,1\}^{n-m}$ and compute $T = F_3(k, S)$.
(4) Set $C = X \oplus (S \| T)$.
$\mathcal{B}'$ computes its trapdoor as follows:
(1) Submit $w$ to its own challenger to obtain the result $X$.
(2) Compute $k = F_2(k_2, X)$.
(3) Set $T_w = (X, k)$.
When $\mathcal{A}$ submits two challenge keywords $w_0^*$ and $w_1^*$, $\mathcal{B}'$ picks a random bit $b$ and sends $w_b^*$ to the oracle of PRF $F_1$ for challenging. The PRF challenger will return the challenge PRF value $X^*$ to $\mathcal{B}'$, which may be $F_1(k_1, w_b^*)$ or a random value. $\mathcal{B}'$ then computes $k^* = F_2(k_2, X^*)$ and returns $T^* = (X^*, k^*)$ to the adversary. Finally, $\mathcal{A}$ outputs a guess bit $b'$. If $b' = b$, $\mathcal{B}'$ outputs 1; otherwise, it outputs 0.
From the above analysis, it is clear that if $X^* = F_1(k_1, w_b^*)$, $\mathcal{B}'$ actually simulates an environment of Game 1 for the adversary $\mathcal{A}$. If $X^*$ is random, the simulated environment is identical to Game 2. Thus, if $\mathcal{A}$’s success probability between Game 1 and Game 2 has difference $\epsilon$, then $\mathcal{B}'$ can distinguish $F_1$ from a random function with the same advantage. This completes the proof of Equation (9).
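Note that in Game 2, the challenge trapdoor $T^*$ is independent of the two challenge keywords. So, the adversary has no success advantage in this game, i.e.,

$$\Pr[S_2] = \frac{1}{2}. \tag{10}$$

Taking Equations (6) to (10) together, it follows that

$$\mathrm{Adv}^{\mathrm{TI}}_{\mathcal{A}}(\lambda) \le \mathrm{Adv}^{\mathrm{ODH}}_{\mathcal{B}}(\lambda) + \mathrm{Adv}^{\mathrm{PRF}}_{F_1,\mathcal{B}'}(\lambda). \tag{11}$$

This completes the proof of Theorem 6.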
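The final step of the argument combines the game hops with the triangle inequality. Written out, with $S_i$ denoting the event that the adversary wins Game $i$ and with the advantage symbols below being notation assumed for this sketch, the chain reads:

```latex
\begin{align*}
\mathrm{Adv}^{\mathrm{TI}}_{\mathcal{A}}(\lambda)
  &= \left|\Pr[S_0] - \tfrac{1}{2}\right| \\
  &\le \left|\Pr[S_0] - \Pr[S_1]\right|
     + \left|\Pr[S_1] - \Pr[S_2]\right|
     + \left|\Pr[S_2] - \tfrac{1}{2}\right| \\
  &\le \mathrm{Adv}^{\mathrm{ODH}}_{\mathcal{B}}(\lambda)
     + \mathrm{Adv}^{\mathrm{PRF}}_{F_1,\mathcal{B}'}(\lambda)
     + 0 .
\end{align*}
```

The first two differences are bounded by the hop from Game 0 to Game 1 and the hop from Game 1 to Game 2, and the last term vanishes because the challenge trapdoor in Game 2 carries no information about the challenge bit.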

The cipher text indistinguishability of our PAEKS scheme follows from the theorem below.

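Theorem 7. If the oracle Diffie-Hellman assumption holds and $F_2$, $F_3$ are pseudorandom functions, then our PAEKS scheme achieves cipher text indistinguishability. Specifically, for any PPT adversary $\mathcal{A}$, we have

$$\mathrm{Adv}^{\mathrm{CI}}_{\mathcal{A}}(\lambda) \le \mathrm{Adv}^{\mathrm{ODH}}_{\mathcal{B}}(\lambda) + \mathrm{Adv}^{\mathrm{PRF}}_{F_2,\mathcal{B}'}(\lambda) + \mathrm{Adv}^{\mathrm{PRF}}_{F_3,\mathcal{B}''}(\lambda), \tag{12}$$

where $\mathrm{Adv}^{\mathrm{ODH}}_{\mathcal{B}}(\lambda)$, $\mathrm{Adv}^{\mathrm{PRF}}_{F_2,\mathcal{B}'}(\lambda)$, and $\mathrm{Adv}^{\mathrm{PRF}}_{F_3,\mathcal{B}''}(\lambda)$ are the advantages to break the ODH assumption and the pseudorandomness of the PRFs $F_2$ and $F_3$, respectively.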

Proof. Similar to the proof of Theorem 6, we prove the above theorem via a sequence of games. In each game, $\mathcal{A}$ is a PPT adversary aiming to break the cipher text indistinguishability of our PAEKS scheme, $b$ is the challenge random bit selected by the challenger, and $b'$ is $\mathcal{A}$’s guess bit. We denote the event that $b' = b$ in Game $i$ as $S_i$.
Game 0. This is the original cipher text indistinguishability game as defined in Definition 5. So,

$$\mathrm{Adv}^{\mathrm{CI}}_{\mathcal{A}}(\lambda) = \left| \Pr[S_0] - \frac{1}{2} \right|. \tag{13}$$

Game 1. This game is the same as Game 0, except that the value $(k_1, k_2)$ is chosen randomly from $\mathcal{K}_1 \times \mathcal{K}_2$. Under the ODH assumption, these two games are computationally indistinguishable, i.e.,

$$\left| \Pr[S_1] - \Pr[S_0] \right| \le \mathrm{Adv}^{\mathrm{ODH}}_{\mathcal{B}}(\lambda). \tag{14}$$

The proof of the above equation is similar to that of Equation (7); we omit it here.
Game 2. This game is identical to Game 1, except for the following modification to the challenge cipher text. Suppose that $w_b^*$ is the challenge keyword and $k^*$ is the corresponding internal value of the cipher text. In this game, $k^*$ is selected randomly from $\mathcal{K}_3$, instead of being computed via $F_2(k_2, X^*)$. Note that, for a normal keyword cipher text, $k$ is still computed from $F_2(k_2, X)$. Under the assumption that $F_2$ is a pseudorandom function, these two games are computationally indistinguishable. Specifically, we have

$$\left| \Pr[S_2] - \Pr[S_1] \right| \le \mathrm{Adv}^{\mathrm{PRF}}_{F_2,\mathcal{B}'}(\lambda). \tag{15}$$

The proof of the above equation is similar to that of Equation (9); we omit it here.
Game 3. In this game, we replace the challenge value $T^* = F_3(k^*, S^*)$ with a random string from $\{0,1\}^m$. Recall that, in this game, $k^*$ is sampled uniformly from $\mathcal{K}_3$. By the pseudorandomness of PRF $F_3$, $F_3(k^*, S^*)$ is computationally indistinguishable from a random $m$-bit string. Similarly, we can prove that

$$\left| \Pr[S_3] - \Pr[S_2] \right| \le \mathrm{Adv}^{\mathrm{PRF}}_{F_3,\mathcal{B}''}(\lambda). \tag{16}$$

In Game 3, the challenge cipher text $C^* = X^* \oplus (S^* \| T^*)$ is random and is independent of the challenge keywords. So, the adversary has no advantage in this game, i.e.,

$$\Pr[S_3] = \frac{1}{2}. \tag{17}$$

Taking Equations (13) to (17) together, we complete the proof of Theorem 7.

From Theorems 6 and 7, we conclude that our PAEKS scheme is semantically secure against inside keyword guessing attacks, assuming that the ODH problem is hard and $F_1$, $F_2$, $F_3$ are PRFs.

4. Experiments and Efficiency Comparison

In this section, we analyze the efficiency of our PAEKS scheme and compare it with some other related schemes, including Boneh et al.’s PEKS scheme [4] and the PAEKS schemes of [16, 18, 19]. Except for our scheme, all the others are designed in bilinear groups. That is, besides the group $G$, there is another group $G_T$ and a bilinear map $e$ defined from $G \times G$ to $G_T$.

Table 1 gives the theoretical efficiency comparison in terms of keyword encryption, trapdoor generation, testing, and two security properties. In the table, separate symbols denote the evaluation of a modular exponentiation, the evaluation of a bilinear pairing, a special hash function that maps an arbitrary string to a group element, a traditional hash function (e.g., MD5), and a pseudorandom function.

Figure 3 shows the length of each parameter in different PEKS/PAEKS schemes. With the exception of Boneh et al.’s scheme, the other three schemes involve the sender’s public key and secret key in the keyword encryption algorithm and trapdoor generation algorithm, respectively. It can be seen from the figure that our scheme has shorter trapdoor and cipher text than other schemes. For the other parameters, our scheme still has comparable length with other schemes.

Among these operations, the computation of the pairing is usually the most time-consuming. According to the construction of the hash-to-group function in [21], its computation is usually inefficient compared with a traditional hash function. In the random oracle model, it is easy to construct a PRF from an efficient hash function. From these observations, we can see that our keyword testing algorithm should be much faster than that of the other three schemes. For encryption and trapdoor generation, the advantage of our scheme is less pronounced. In terms of security, Boneh et al.’s scheme cannot resist IKGA. The scheme of [16] can prevent IKGA, but it is not secure in a multiuser setting. The scheme of [19] was not shown to be secure in a multiuser setting.

To evaluate the efficiency of these schemes in practice, we implement them on a laptop with a 1.7 GHz Intel i3 CPU, 2 GB memory, and a Windows 7 operating system. We use the jPBC library and choose a type A pairing, which makes use of the curve $y^2 = x^3 + x$ over the field $\mathbb{F}_q$ for a prime $q \equiv 3 \pmod 4$. We run each algorithm a varying number of times and record the running time in seconds. The results are shown in Figures 4, 5, and 6, respectively. As the schemes of Noroozi and Eslami and of Huang and Li involve the same computations, they have the same experimental results. Experimental results show that our encryption algorithm and trapdoor generation algorithm are slightly faster than those of the other schemes, whereas our keyword testing algorithm is significantly faster than that of the other schemes.

5. Conclusion

In this paper, we proposed a new public-key authenticated encryption scheme with keyword search. Our scheme uses the idea of the Diffie-Hellman key exchange protocol to generate a shared secret key between the sender and the receiver. The shared key can be viewed as the secret key of a symmetric-key searchable encryption scheme to encrypt keywords by the sender or to generate search trapdoors by the receiver. Under the ODH assumption, our PAEKS scheme can achieve both trapdoor indistinguishability and cipher text indistinguishability, and hence, it can resist inside keyword guessing attacks. The scheme is also efficient. Specifically, its keyword searching algorithm is very fast in the sense that it requires only one computation of PRF, while the previous schemes require at least one expensive pairing operation.

Data Availability

The data used to support the findings of this study are embedded in the programming. They are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (grant numbers 61872292 and 61772418), the Key Research and Development Program of Shaanxi (grant number 2020ZDLGY08-04), and the Basic Research Program of Qinghai Province (grant number 2020-ZJ-701).