Abstract

With the rapid development of wireless technologies, smart devices have become an integral part of our daily life. Authentication is one of the most common and effective methods for these smart devices to prevent unauthorized access. Moreover, smart devices tend to have limited computing power, and they may hold sensitive data. In this paper, we investigate performing graph operations in a privacy-preserving manner, which can be used for anonymous authentication for smart devices. We propose two protocols that allow two parties to jointly compute the intersection and union of their private graphs. Our protocols utilize homomorphic encryption to prevent information leakage during the process, and we provide security proofs of the protocols in the semihonest setting. Finally, we implement the protocols and evaluate their efficiency through experiments on real-world graph data.

1. Introduction

With the rapid development of IoT technology, we are surrounded by various types of smart devices in our daily life, such as sensors, wearable devices, and smart vehicles [1]. Authentication is one of the most important mechanisms to provide security protection for these smart devices [2], and authentication for lightweight devices has become a hot research topic in recent years [3, 4].

In recent years, researchers have proposed several mobile authentication schemes based on graph data structures and graph algorithms [5–7]. Graph data and graph processing have been well studied over the last decades [8, 9], since they help to solve many practical problems in different application areas, such as web data processing [10], data mining [11], social networking [12], biological networking [13], and communication networking [14].

1.1. Motivation

In this paper, we consider the problem of computing graph operations between two parties while preventing information leakage, which has great potential in smart device authentication. For example, when mobile devices communicate with cloud servers, the two sides first need to jointly perform identity authentication for security protection. Since the mobile devices may contain sensitive information about their users and the cloud servers cannot be fully trusted in general, privacy leakage during mobile authentication has become a security threat [15]. In order to protect the privacy of the mobile devices, the devices can model their identities and properties as graph-structured data, and the cloud servers can model their authentication policies as graph-structured data as well. After that, the identity authentication process can be converted into performing graph operations in a privacy-preserving manner.

1.2. Our Contributions

We study the problem of performing graph intersection and union while protecting the privacy of the input graphs. Suppose that two parties, Alice and Bob, each hold a private graph. Alice wishes to learn the intersection and union of these two graphs. In addition, neither Alice nor Bob wishes to reveal any information about their graph to the other party. The contributions of this paper can be summarized as follows:

(i) We present two graph operation protocols between two parties, a server and a client. The first protocol allows the server and the client to jointly compute the intersection of their input graphs, and the second protocol computes the union of the input graphs. Our constructions first use the Paillier cryptosystem and oblivious polynomial evaluation to compute the intersection and the union of the vertices. After that, we use the homomorphic property of the Paillier cryptosystem to compute the edge intersection and union.

(ii) We provide the security models of the protocols, and we prove that the protocols are secure in the semihonest setting. Furthermore, we analyze the information leakage and propose methods to minimize it.

(iii) We discuss the efficiency of the protocols in terms of computation cost and communication cost. Finally, we implement our constructions and perform experiments on real-world graph data.

An earlier version of this paper was presented at the 22nd Australasian Conference on Information Security and Privacy, 2017 [16]. The previous work presented a private graph intersection protocol with rough analysis. This paper extends the previous work by presenting a private graph union protocol with detailed analysis and experimental results.

2. Related Work

There are many different approaches to constructing authentication schemes for smart devices. Among them, graph-based authentication schemes are widely used in IoT [5, 17, 18]. In 2002, Micali and Rivest [19] first introduced the transitive signature based on graph theory, which provides an unforgeable signature for undirected graphs. After that, various graph-based signature and authentication schemes were proposed [5–7]. In 2017, Chuang et al. [5] proposed an authentication system for the Internet of Things based on multigraph zero-knowledge. The system provides suitable security protection for IoT authentication services. The proposed multigraph zero-knowledge procedure is faster than traditional zero-knowledge methods and ECC-based solutions. The experimental results indicate that the system is lightweight and highly adaptive. Lin et al. [6] proposed a transitively closed graph authentication scheme for blockchain-based identity management systems in 2018. The system is used to bind a digital identity object to its real-world entity, therefore achieving identity authentication. The system is constructed based on transitively closed undirected graphs and vertex signatures. According to the evaluation results, the system is efficient, even when vertices and edges are dynamically added to or deleted from the graph. In 2019, Shao et al. [7] proposed a multifactor authentication scheme using a fuzzy graph domination model. The scheme adaptively chooses one or multiple privacy-preserving identities to authenticate the users. The authors designed a weighted vertex-edge dominating set to solve the weighted domination problem on fuzzy graphs. Compared to existing solutions, the scheme is more efficient for solving instances with moderate orders.

In this work, we consider the problem of performing graph intersection and union in a privacy-preserving manner and propose two secure multiparty computation protocols. Secure multiparty computation (MPC) has been extensively studied over the past decades. Generally speaking, MPC allows multiple participants to jointly perform certain computations without losing the privacy of their input data, even when some players cheat during the process. MPC was first formally introduced by Yao in 1982 [20] and extended by Goldreich et al. [21]. Their works convert certain computation problems into a combinatorial circuit, and the parties then perform computations over the gates in the circuit. Since then, a large number of MPC protocols have been proposed to solve various problems, such as privacy-preserving set operations [22] and private information retrieval [23].

3. Preliminary

In this section, we present the preliminaries related to our proposed protocols. The notation used in this paper is summarized in Table 1.

3.1. Additive Homomorphic Encryption

Homomorphic encryption schemes allow the users to perform certain computation operations on the ciphertext space, such as addition and multiplication. In our private graph operation protocols, we utilize an additive homomorphic encryption scheme called the Paillier cryptosystem, proposed by Paillier in 1999 [24]. The Paillier cryptosystem contains three algorithms, described as follows:

Key generation. The input is a security parameter. The outputs are a public key and a secret key. The public key contains a large number $N$, which specifies the message space, the ciphertext space, and the random space to be $\mathbb{Z}_N$, $\mathbb{Z}_{N^2}^*$, and $\mathbb{Z}_N^*$, respectively.

Encryption. The input is the public key, a plaintext, and a random number. The output is the ciphertext. For simplicity, we use a shorthand notation that omits the public key and the random number.

Decryption. The input is the secret key and a ciphertext. The output is the plaintext. For simplicity, we use a shorthand notation that omits the secret key.

The Paillier cryptosystem has the following properties:

3.1.1. Correctness

For any key pair and any plaintext, decrypting an encryption of the plaintext always returns the plaintext.

3.1.2. IND-CPA Security

The ciphertexts of any two plaintexts are indistinguishable to probabilistic polynomial-time adversaries that only have access to the public parameters.

3.1.3. Homomorphic Property

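For any two plaintexts, there exists an operation in the ciphertext space that combines their two ciphertexts into an encryption of the sum of the plaintexts. Furthermore, there exists another operation in the ciphertext space that combines the ciphertext of the first plaintext with the second plaintext, used as a constant, into an encryption of the product of the two plaintexts.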
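The three algorithms and the two homomorphic operations can be illustrated with a minimal toy sketch. It assumes the common simplified Paillier variant with generator N + 1 and uses two fixed Mersenne primes, so it is not secure and is not the implementation evaluated in this paper; the function names (`keygen`, `encrypt`, `decrypt`, `add`, `mul_const`) are hypothetical.

```python
import math
import random

def keygen(p=2**61 - 1, q=2**89 - 1):
    """Toy key generation from two fixed primes (illustration only, not secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # inverse of lam mod n; valid for generator n + 1
    return n, (lam, mu)

def random_unit(n):
    """Sample a random value from the random space Z_n^*."""
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            return r

def encrypt(n, m):
    """Enc(m) = (n + 1)^m * r^n mod n^2 with fresh randomness r."""
    n2 = n * n
    return pow(n + 1, m % n, n2) * pow(random_unit(n), n, n2) % n2

def decrypt(n, sk, c):
    """Dec(c) = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) / n."""
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

def add(n, c1, c2):
    """Homomorphic addition: decrypts to the sum of the two plaintexts."""
    return c1 * c2 % (n * n)

def mul_const(n, c, k):
    """Multiplication by a plaintext constant k: decrypts to plaintext * k."""
    return pow(c, k, n * n)

n, sk = keygen()
c1, c2 = encrypt(n, 20), encrypt(n, 22)
print(decrypt(n, sk, add(n, c1, c2)))        # 42
print(decrypt(n, sk, mul_const(n, c1, 3)))   # 60
```

In this sketch, addition of plaintexts corresponds to multiplication of ciphertexts, and multiplication by a constant corresponds to exponentiation of a ciphertext, which is why the cost analysis of such protocols is usually expressed in modular multiplications and exponentiations.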

3.2. Private Set Intersection

Private Set Intersection (PSI) is a cryptographic protocol that allows two parties, each holding a private set, to jointly compute the intersection of their sets without leaking any additional information. The first secure two-party private set intersection protocol was introduced by Freedman, Nissim, and Pinkas (FNP) in 2004 [25]. The protocol utilizes homomorphic encryption and oblivious polynomial evaluation to ensure that each party learns no information about the other party’s private input during the computation. Later, several other protocols were proposed with different features and security levels [26–28].

3.3. Graph Representation

In our protocols, we represent a graph as a pair consisting of a vertex collection and an edge collection. We represent the vertex collection as a set sorted in ascending order, so that each vertex is strictly smaller than the next one. We represent the edge collection as an adjacency matrix, where each entry is the adjacency relation between a pair of vertices and takes the value 0 or 1. If two vertices are adjacent, i.e., there is at least one edge that connects them, the corresponding entry is 1; otherwise, it is 0. Note that the adjacency matrix is a square matrix whose numbers of rows and columns both equal the number of vertices. For an undirected graph, the adjacency matrix is symmetric, since the edges are two-way.

For example, Figure 1 illustrates a directed graph that can be represented in this form.
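The representation described in this section can be sketched in a few lines. The helper name `to_matrix` and the small example graph are hypothetical and chosen only for illustration; they are not the graph of Figure 1.

```python
def to_matrix(vertices, edges, directed=True):
    """Return (sorted vertex list, 0/1 adjacency matrix) for a graph.

    `edges` is an iterable of (source, target) pairs over `vertices`.
    """
    order = sorted(set(vertices))                 # ascending vertex order
    index = {v: i for i, v in enumerate(order)}   # vertex -> row/column
    size = len(order)
    matrix = [[0] * size for _ in range(size)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        if not directed:
            matrix[index[v]][index[u]] = 1        # undirected: symmetric
    return order, matrix

# Hypothetical undirected path graph 1 - 2 - 3
order, matrix = to_matrix([3, 1, 2], [(1, 2), (2, 3)], directed=False)
print(order)    # [1, 2, 3]
print(matrix)   # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
```

Sorting the vertex collection gives both parties the same row and column order for any shared vertex set, which is what later allows two adjacency matrices to be combined entry by entry.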

4. Definitions and Security Models

4.1. Protocol Definitions

We formally describe the private graph intersection (PGI) protocol and the private graph union (PGU) protocol. The protocols involve two participants, a server and a client. Each of the participants holds a private graph, which is intended to be kept secret from the other participant.

Each graph consists of a set of vertices and a set of edges. The intersection of the server's graph and the client's graph is defined as the graph whose vertex set is the intersection of the two vertex sets and whose edge set is the intersection of the two edge sets. The union of the two graphs is defined as the graph whose vertex set is the union of the two vertex sets and whose edge set is the union of the two edge sets.
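As a plaintext reference point, the two operations can be computed directly when both graphs are known to one party. The sketch below assumes that a graph is modelled as a pair of Python sets (vertices, directed edges); the function names are hypothetical. The protocols in this paper aim to produce the same outputs without either party seeing the other's input.

```python
def graph_intersection(g1, g2):
    """Componentwise intersection of two graphs given as (vertices, edges)."""
    (v1, e1), (v2, e2) = g1, g2
    return v1 & v2, e1 & e2

def graph_union(g1, g2):
    """Componentwise union of two graphs given as (vertices, edges)."""
    (v1, e1), (v2, e2) = g1, g2
    return v1 | v2, e1 | e2

# Hypothetical example graphs
server = ({1, 2, 3, 4}, {(1, 2), (2, 3), (3, 4)})
client = ({2, 3, 4, 5}, {(2, 3), (4, 3), (4, 5)})
print(graph_intersection(server, client))
print(graph_union(server, client))
```

Because every common edge has both endpoints in both vertex sets, the edge set of the intersection only involves vertices of the vertex intersection, so the result is itself a well-formed graph.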

PGI and PGU allow the participants to jointly compute the graph intersection and the graph union, respectively, in a privacy-preserving manner. At the end of the protocols, only the server learns the result. The formal definitions of PGI and PGU are as follows:

Definition 1 (private graph intersection protocol). If both participants are honest, for any server graph and any client graph, the private graph intersection protocol computes the intersection of the two graphs. At the end of the protocol, only the server learns the graph intersection.

Definition 2 (private graph union protocol). If both participants are honest, for any server graph and any client graph, the private graph union protocol computes the union of the two graphs. At the end of the protocol, only the server learns the graph union.

4.2. Security Models

When considering privacy protection in authentication, the term privacy may have different definitions and properties, such as user identity and untraceability [29, 30]. In this work, the privacy of the server and the client refers to any information about their graphs. Therefore, any information about the vertices and edges of the graphs is considered private, such as the number of vertices, the number of edges, the values of the vertices, and whether two vertices are connected by an edge.

The security goal of both the PGI and PGU protocols is to protect the privacy of both the server and the client during the computation. In other words, both the server and the client should learn no information about the graph of the other party.

We use the semihonest security model for both PGI and PGU, which means that both the server and the client perform the protocols faithfully, but they may try to learn any information about the graph of the other participant. The security models are adopted from [31–33].

While achieving no information leakage is the ideal goal, our protocols leak partial information during the process. The information leakage of each protocol is captured by two leakage functions, one for the server and one for the client. The leakage functions are as follows: (i) in PGI, the server learns the number of vertices in the client's graph; (ii) in PGI, the client learns the vertex intersection and the number of vertices in the server's graph; (iii) in PGU, the server learns the number of vertices in the client's graph and the number of common vertices between the two graphs; (iv) in PGU, the client learns the vertex union and the number of vertices in the server's graph.

The formal definitions of security models are described as follows:

Definition 3 (PGI security). A semihonest server learns nothing about the client’s graph, beyond what can be deduced from the graph intersection and the server's leakage function, and a semihonest client learns nothing about the server’s graph, beyond the client's leakage function.

Definition 4 (PGU security). A semihonest server learns nothing about the client’s graph, beyond what can be deduced from the graph union and the server's leakage function, and a semihonest client learns nothing about the server’s graph, beyond the client's leakage function.

5. Protocol Construction

In this section, we propose the constructions of PGI and PGU. The graphs of the server and the client are represented in the form described in Section 3.3, i.e., each graph consists of a sorted vertex set and an adjacency matrix.

5.1. PGI Construction

We use the FNP protocol [25] as a building block for computing the vertex intersection. The private graph intersection protocol is described below:

Input: The server and the client hold their private graphs, respectively.

Output: The server learns the graph intersection.

Protocol:

Step 1. The server runs the key generation algorithm of the Paillier cryptosystem and obtains the public key and the secret key. Then, the server publishes the public key.

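Step 2. (a) The server constructs a polynomial whose roots are exactly the elements of the server's vertex set. In other words, the polynomial evaluates to zero at a value if and only if that value is a vertex of the server's graph.
(b) The server encrypts each coefficient of the polynomial under the Paillier cryptosystem and sends the set of ciphertexts to the client.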

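Step 3. (a) By using the homomorphic properties of the Paillier cryptosystem, the client evaluates the encrypted polynomial using each element of the client's vertex set as input. In other words, the client computes an encryption of the polynomial evaluation for each of its vertices.
(b) For each polynomial evaluation, the client chooses a random value and homomorphically computes an encryption of the evaluation multiplied by the random value, plus the vertex itself. Then, the client sends the resulting ciphertexts to the server.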

Step 4. The server decrypts all the ciphertexts received and compares the decrypted values with its own vertex set. If a decrypted value has a corresponding element in the server's vertex set, it is an element of the vertex intersection. After decrypting all the received ciphertexts, the server obtains the vertex intersection.

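Step 5. (a) The server uses the vertex intersection to construct an adjacency matrix whose numbers of rows and columns both equal the number of vertices in the vertex intersection. The matrix has the property that, for each pair of vertices in the vertex intersection, if an edge exists in the server's graph between the two vertices, the corresponding entry is 1; otherwise, it is 0.
(b) The server encrypts each element of the matrix under the Paillier cryptosystem and obtains an encrypted matrix.
(c) The server sends the encrypted matrix and the vertex intersection to the client.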

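Step 6. (a) By using the vertex intersection, the client constructs an adjacency matrix for its own graph using the same method as in the last step.
(b) The client homomorphically multiplies each element of the encrypted matrix by the corresponding element of its own adjacency matrix.
(c) The client sends the resulting encrypted matrix to the server.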

Step 7. The server decrypts each element of the received matrix and obtains the edge intersection. Finally, the server obtains the graph intersection.
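The seven steps can be traced end to end with a toy sketch in which both roles run in one process. It assumes a simplified Paillier variant with generator N + 1 and fixed Mersenne primes (not secure), vertices given as non-negative integers far smaller than N, and graphs given as (vertex set, directed edge set) pairs. All names are hypothetical. The fresh encryption of 0 multiplied in at Step 6 is an added assumption of this sketch, used to re-randomize the client's reply; the protocol text above does not state it.

```python
import math
import random

# Toy Paillier with generator N + 1 and fixed Mersenne primes (not secure).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def rand_unit():
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            return r

def enc(m):
    return pow(N + 1, m % N, N2) * pow(rand_unit(), N, N2) % N2

def dec(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

def poly_from_roots(roots):
    """Coefficients (lowest degree first) of prod (x - root) modulo N."""
    coeffs = [1]
    for root in roots:
        nxt = [0] * (len(coeffs) + 1)
        for i, a in enumerate(coeffs):
            nxt[i] = (nxt[i] - root * a) % N
            nxt[i + 1] = (nxt[i + 1] + a) % N
        coeffs = nxt
    return coeffs

def masked_eval(enc_coeffs, v):
    """Client, Step 3: Enc(r * f(v) + v) computed from encrypted coefficients."""
    acc = enc(0)
    for i, c in enumerate(enc_coeffs):
        acc = acc * pow(c, pow(v, i, N), N2) % N2   # adds a_i * v^i
    return pow(acc, rand_unit(), N2) * enc(v) % N2

def pgi(server_graph, client_graph):
    (vs, es), (vc, ec) = server_graph, client_graph
    # Step 2 (server): encrypt the coefficients of the vertex polynomial.
    enc_coeffs = [enc(a) for a in poly_from_roots(sorted(vs))]
    # Step 3 (client): one masked evaluation per client vertex.
    replies = [masked_eval(enc_coeffs, v) for v in sorted(vc)]
    # Step 4 (server): keep decrypted values that are server vertices.
    vi = sorted(m for m in map(dec, replies) if m in vs)
    # Step 5 (server): encrypted adjacency matrix over the intersection.
    enc_as = [[enc(int((u, w) in es)) for w in vi] for u in vi]
    # Step 6 (client): multiply each entry by its own 0/1 entry.
    enc_ai = [[pow(enc_as[i][j], int((u, w) in ec), N2) * enc(0) % N2
               for j, w in enumerate(vi)] for i, u in enumerate(vi)]
    # Step 7 (server): decrypt; entries equal to 1 are common edges.
    ei = {(u, w) for i, u in enumerate(vi) for j, w in enumerate(vi)
          if dec(enc_ai[i][j]) == 1}
    return set(vi), ei

server = ({1, 2, 3, 4}, {(1, 2), (2, 3), (3, 4)})
client = ({2, 3, 4, 5}, {(2, 3), (4, 3), (4, 5)})
print(pgi(server, client))   # vertices {2, 3, 4}, edges {(2, 3)}
```

For a common vertex the polynomial evaluates to zero, so the masked value decrypts to the vertex itself; for any other vertex it decrypts to a value that matches a server vertex only with negligible probability at this modulus size.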

5.2. PGU Construction

The private graph union protocol is described below:

Input: The server and the client hold their private graphs, respectively.

Output: The server learns the graph union.

Protocol:

Step 1. Same as Step 1 of PGI.

Step 2. Same as Step 2 of PGI.

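Step 3. (a) By using the homomorphic properties of the Paillier cryptosystem, the client evaluates the encrypted polynomial using each element of the client's vertex set as input. In other words, the client computes an encryption of the polynomial evaluation for each of its vertices.
(b) For each polynomial evaluation, the client chooses a random value and homomorphically multiplies the evaluation by the random value. Then, the client sends the set of all resulting ciphertexts to the server.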

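Step 4. The server decrypts each ciphertext received and checks the decrypted value. If the decrypted value is zero, the server computes an encryption of 0; otherwise, the server computes an encryption of 1. Then, the server sends the resulting ciphertexts to the client.

Step 5. After receiving the ciphertexts, the client homomorphically multiplies each received ciphertext by the corresponding vertex of the client's vertex set. Then, the client sends the resulting ciphertexts to the server.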

Step 6. (a) The server decrypts each received value and checks whether the decrypted value is zero.
(b) By combining the server’s vertex set and the set of nonzero decrypted values, the server obtains the vertex union. The vertex union is then sorted in ascending order.

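Step 7. (a) The server uses the vertex union to construct an adjacency matrix whose numbers of rows and columns both equal the number of vertices in the vertex union. The matrix has the property that, for each pair of vertices in the vertex union, if an edge exists in the server's graph between the two vertices, the corresponding entry is 1; otherwise, it is 0.
(b) The server encrypts each element of the matrix under the Paillier cryptosystem and sends the encrypted matrix and the vertex union to the client.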

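Step 8. (a) The client uses the vertex union to construct an adjacency matrix for its own graph in the same manner as in the last step.
(b) The client encrypts each element of its adjacency matrix using the Paillier cryptosystem and obtains an encrypted matrix.
(c) The client generates a matrix of the same size filled with random values.
(d) The client homomorphically adds the two encrypted matrices element by element and homomorphically multiplies each sum by the corresponding random value.
(e) The client sends the resulting encrypted matrix to the server.

Step 9. The server decrypts the received matrix. Each nonzero decrypted element is set to 1. Finally, the server obtains the edge union and hence the graph union.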
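The union protocol can be traced in the same way. The sketch below again runs both roles in one process under stated assumptions: a simplified Paillier variant with generator N + 1 and fixed Mersenne primes (not secure), vertices given as positive integers far smaller than N (a vertex with value 0 could not be told apart from the "common vertex" marker in Step 6), and graphs given as (vertex set, directed edge set) pairs. All names are hypothetical.

```python
import math
import random

# Toy Paillier with generator N + 1 and fixed Mersenne primes (not secure).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def rand_unit():
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            return r

def enc(m):
    return pow(N + 1, m % N, N2) * pow(rand_unit(), N, N2) % N2

def dec(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

def poly_from_roots(roots):
    """Coefficients (lowest degree first) of prod (x - root) modulo N."""
    coeffs = [1]
    for root in roots:
        nxt = [0] * (len(coeffs) + 1)
        for i, a in enumerate(coeffs):
            nxt[i] = (nxt[i] - root * a) % N
            nxt[i + 1] = (nxt[i + 1] + a) % N
        coeffs = nxt
    return coeffs

def eval_poly(enc_coeffs, v):
    """Enc(f(v)) computed from encrypted coefficients."""
    acc = enc(0)
    for i, c in enumerate(enc_coeffs):
        acc = acc * pow(c, pow(v, i, N), N2) % N2
    return acc

def pgu(server_graph, client_graph):
    (vs, es), (vc, ec) = server_graph, client_graph
    cv = sorted(vc)
    # Step 2 (server): encrypted coefficients of the vertex polynomial.
    enc_coeffs = [enc(a) for a in poly_from_roots(sorted(vs))]
    # Step 3 (client): Enc(r * f(v)); zero exactly for common vertices.
    masked = [pow(eval_poly(enc_coeffs, v), rand_unit(), N2) for v in cv]
    # Step 4 (server): Enc(0) for common vertices, Enc(1) otherwise.
    flags = [enc(0) if dec(c) == 0 else enc(1) for c in masked]
    # Step 5 (client): multiply each flag by the corresponding vertex.
    picked = [pow(f, v, N2) for f, v in zip(flags, cv)]
    # Step 6 (server): nonzero values are vertices only the client has.
    vu = sorted(set(vs) | {m for m in map(dec, picked) if m != 0})
    # Step 7 (server): encrypted adjacency matrix over the vertex union.
    enc_as = [[enc(int((u, w) in es)) for w in vu] for u in vu]
    # Step 8 (client): add its own encrypted entry, then mask the sum.
    enc_au = [[pow(enc_as[i][j] * enc(int((u, w) in ec)) % N2, rand_unit(), N2)
               for j, w in enumerate(vu)] for i, u in enumerate(vu)]
    # Step 9 (server): any nonzero decryption marks an edge of the union.
    eu = {(u, w) for i, u in enumerate(vu) for j, w in enumerate(vu)
          if dec(enc_au[i][j]) != 0}
    return set(vu), eu

server = ({1, 2, 3, 4}, {(1, 2), (2, 3), (3, 4)})
client = ({2, 3, 4, 5}, {(2, 3), (4, 3), (4, 5)})
print(pgu(server, client))
```

The random masks are drawn from the invertible residues modulo N, so a nonzero sum can never be masked to zero and the nonzero test in Steps 4, 6, and 9 is reliable in this sketch.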

6. Analysis

6.1. Security Analysis

In this section, we prove the correctness and security of both PGI and PGU. When analyzing the security of the proposed protocols, we assume both the server and the client evaluate the protocols faithfully, but they may try to obtain as much information about the graph of the other party as possible. The security analysis for the protocols is divided into two cases, where one of the server and the client acts as the adversary in each case. Then, we prove the zero-knowledge properties of the server and the client in each case, using the methods and techniques introduced in [15, 34].

Lemma 5 (PGI correctness). If both participants are honest, for any server graph and any client graph, the private graph intersection protocol computes the intersection of the two graphs.

Proof. The correctness of PGI is ensured by the correctness of the FNP protocol and the homomorphic property of the Paillier cryptosystem.

During Steps 2 to 4 of the protocol, the client and the server jointly perform an FNP protocol using their vertex collections as inputs. At the end of Step 4, the server learns the vertex intersection, and the client receives the vertex intersection from the server in Step 5.

In Steps 5 and 6, the server and the client each construct an adjacency matrix by using the vertex intersection. The two matrices contain the adjacency relations between the vertices of the vertex intersection in the server's graph and in the client's graph, respectively. In other words, if an edge exists between two vertices of the vertex intersection, it leads to a value of 1 in the corresponding position of the constructed adjacency matrix; otherwise, it leads to a value of 0. Therefore, the element-wise product of the two matrices is an adjacency matrix that represents the edge intersection. If an edge exists in both graphs, i.e., it is a common edge, the product of its adjacency relations is 1. If an edge exists in only one of the graphs, or does not exist at all, the product is 0.

In Step 6, the client holds the encryption of the server's adjacency matrix under the Paillier cryptosystem. Since the Paillier cryptosystem has the homomorphic property, i.e., it supports multiplication between a ciphertext and a constant, the client can homomorphically compute the element-wise product of the two matrices, and the result is the encryption of the edge intersection. Finally, in Step 7, the server obtains the edge intersection after decryption.

As a result, if the FNP protocol is correct and the Paillier cryptosystem has the homomorphic property, the private graph intersection protocol computes the graph intersection.

Lemma 6 (PGI server zero-knowledge). A semihonest server learns nothing about the client’s graph, beyond what can be deduced from the graph intersection and the server's leakage function.

Proof. The proof of PGI server zero-knowledge is straightforward. During PGI, there are two parts where the server receives information about the client’s graph. The first part is during the FNP protocol in Step 3, and the second part is at the end of Step 6.

For the first part, in Step 3, the server receives a set of ciphertexts from the client. The server can learn the number of vertices in the client’s graph by counting the number of ciphertexts, which is the predefined leakage function. By decrypting the ciphertexts, the server obtains a set of values. If a value exists in the server's vertex set, it is a common vertex between the two graphs, which is a part of the final result of the protocol. Otherwise, it is a random value, which has no relation to the client’s graph.

For the second part, the server receives an encrypted matrix from the client, which is the ciphertext of the edge intersection. Upon decryption, the server only learns the edge intersection. As a result, the PGI server zero-knowledge holds.

Lemma 7 (PGI client zero-knowledge). A semihonest client learns nothing about the server’s graph, beyond the client's leakage function.

Proof. There are two parts where the client receives information about the server’s graph. The first part is during the FNP protocol in Step 2, and the second part is at the end of Step 5.

For the first part, the client receives a set of encrypted coefficients of the polynomial from the server. The client can learn the number of vertices of the server’s graph by counting the number of encrypted coefficients received, which is a part of the predefined leakage function.

For the second part, the client receives an encrypted matrix and the vertex intersection. Since the vertex intersection is also a part of the predefined leakage function, we need to show that the encrypted matrix does not reveal any information about the server’s graph. According to the protocol construction, the encrypted matrix contains the encryptions of the adjacency relations between the vertices of the vertex intersection in the server’s graph. Therefore, if the client cannot distinguish between the cases where the server has different input graphs, given the knowledge of the vertex intersection and the number of vertices in the server's graph, the PGI client zero-knowledge holds. Consider the following experiment.

In the experiment, the adversary is a probabilistic polynomial-time adversarial client with a private graph. The adversary first chooses two graphs that have the same number of vertices and the same vertex intersection with the adversary's own graph. The adversary then sends the two graphs to the server. The server randomly picks a bit and chooses the corresponding graph as its private graph. After that, the server and the adversary jointly perform the private graph intersection protocol from Steps 1 to 5.

At the end of Step 5, the adversary needs to output a bit as its guess, using the information it received during the protocol. If the guess equals the bit picked by the server, the experiment outputs 1; otherwise, it outputs 0. The advantage of the adversary is defined as the distance between the probability that the experiment outputs 1 and 1/2.

During PGI, the information that the adversary receives consists of the encrypted polynomial coefficients, the vertex intersection, and the encrypted adjacency matrix. The encrypted coefficients are a set of ciphertexts under the Paillier cryptosystem, and the encrypted adjacency matrix is a matrix of ciphertexts under the Paillier cryptosystem.

Due to the condition on the vertex intersections, the vertex intersection gives no useful information, since it is the same for both candidate graphs. Since the Paillier cryptosystem is IND-CPA secure and the adversary cannot decrypt the ciphertexts without the private key, the encrypted coefficients and the encrypted matrix cannot help to distinguish which graph the server has chosen. As a result, if the Paillier cryptosystem is IND-CPA secure, the advantage of the adversary in the above experiment is negligible.

Finally, we construct a simulator to simulate the view of the client in the ideal model. The simulator is given the vertex intersection and the number of vertices of the server’s graph. In the ideal model, the simulator sends a set of random values to the client in Step 2 and sends the vertex intersection and a matrix of random values to the client in Step 5. Since the client cannot distinguish between ciphertexts under the Paillier cryptosystem and random values, the view of the client in the ideal model is computationally indistinguishable from the view in the real model. As a result, the PGI client zero-knowledge holds.

Lemma 8 (PGU correctness). If both participants are honest, for any server graph and any client graph, the private graph union protocol computes the union of the two graphs.

Proof. The correctness of PGU is ensured by the homomorphic property of the Paillier cryptosystem. Steps 2–6 of PGU compute the vertex union, and Steps 7–9 compute the edge union.

In order to compute the vertex union, the server needs to obtain the vertices of the client's graph that are not in the server's graph.

In Step 2, the server constructs a polynomial whose roots are exactly the vertices of the server's graph. After that, the client homomorphically evaluates the polynomial at each vertex of its own graph, and each polynomial evaluation is homomorphically multiplied by a random value. Therefore, the common vertices between the two graphs result in encryptions of zero, and the other vertices result in encryptions of random values. In Step 4, the server decrypts all the polynomial evaluations. If the decryption is zero, the server generates an encryption of 0; otherwise, the server generates an encryption of 1. In the next step, the client homomorphically multiplies the received encryptions by the corresponding vertices of its own graph. For an encryption of 0, i.e., the vertex is a common vertex, the client obtains an encryption of 0; for an encryption of 1, i.e., the vertex is not a common vertex, the client obtains an encryption of the vertex. As a result, in Step 6, the server learns the set of vertices that exist only in the client's graph. By combining this set with its own vertex set, the server obtains the vertex union.

In order to compute the edge union, the server needs to obtain an adjacency matrix such that, if an edge exists in neither of the two graphs, it has a corresponding value of 0 in the matrix; otherwise, it has a corresponding value of 1.

In Steps 7 and 8, the server and the client each construct an adjacency matrix using the vertex union and their own graph and encrypt each element under the Paillier cryptosystem. The client then homomorphically adds the encrypted values at the same locations in the two matrices. There are three cases for the result of the addition. If an edge does not exist in either of the graphs, the addition results in an encryption of 0; if an edge exists in only one of the graphs, the addition results in an encryption of 1; if an edge exists in both of the graphs, the addition results in an encryption of 2. Then, the client homomorphically multiplies each result by a random value. Therefore, for the edges that do not exist in either of the graphs, the final result is still an encryption of 0; for the edges that exist in one or both of the graphs, the final result is an encryption of a random value. Finally, in Step 9, the server decrypts the encrypted matrix and replaces all the nonzero values with 1, which yields the edge union of the two graphs.

As a result, if the Paillier cryptosystem has the homomorphic property, the private graph union protocol computes the graph union.
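The three cases of the edge-union argument can be written out explicitly. The symbols below are assumed for illustration only: $a_{ij}, b_{ij} \in \{0, 1\}$ denote the adjacency entries of the server and the client over the vertex union, $N$ is the Paillier modulus, and $r_{ij}$ is the client's random value, assumed here to be drawn from $\mathbb{Z}_N^*$.

```latex
\[
r_{ij}\,(a_{ij} + b_{ij}) \bmod N =
\begin{cases}
0, & a_{ij} = b_{ij} = 0,\\
r_{ij}, & a_{ij} + b_{ij} = 1,\\
2\,r_{ij} \bmod N, & a_{ij} = b_{ij} = 1.
\end{cases}
\]
```

Under these assumptions, $N$ is odd and $r_{ij}$ is invertible modulo $N$, so both $r_{ij}$ and $2\,r_{ij}$ are nonzero modulo $N$. A nonzero decryption in Step 9 therefore always corresponds to an edge of the union, and a zero decryption always corresponds to a missing edge.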

Lemma 9 (PGU server zero-knowledge). A semihonest server learns nothing about the client’s graph, beyond what can be deduced from the graph union and the server's leakage function.

Proof. There are three parts where the server receives information from the client, which are Steps 3, 5, and 8.

In Step 3, the server receives a set of ciphertexts from the client. Each vertex of the client's graph has a corresponding ciphertext in this set. If a vertex of the client's graph also exists in the server's graph, i.e., it is a common vertex of both graphs, it results in an encryption of 0; otherwise, it results in an encryption of a random value. By counting the number of ciphertexts, the server can learn the number of vertices in the client’s graph, and by decrypting the ciphertexts and counting the number of zeros, the server can learn the number of common vertices. This information is defined as the server's leakage function.

In Step 5, the server receives another set of ciphertexts from the client. As above, each vertex of the client's graph has a corresponding ciphertext in this set. If a vertex exists in both graphs, it results in an encryption of 0; otherwise, it results in an encryption of the vertex itself. Therefore, upon decryption, the server learns the vertices of the client's graph that do not exist in the server's graph, which are a part of the vertex union.

In Step 8, the server receives an encrypted matrix from the client. Each element of the matrix represents the adjacency relation between two vertices in the graph union. If an edge exists in at least one of the input graphs, the corresponding adjacency value is a random number; if an edge does not exist in either of the input graphs, the adjacency value is 0. By decrypting the matrix and replacing the random values with 1, the server obtains the edge union. As a result, the PGU server zero-knowledge holds.

Lemma 10 (PGU client zero-knowledge). A semihonest client learns nothing about the server’s graph, beyond what can be deduced from the vertex union and the client's leakage function.

Proof. There are three parts where the client receives information from the server, which are Steps 2, 4, and 7. In Step 2, the client receives a set of ciphertexts under the Paillier cryptosystem, which are encryptions of the coefficients of the server’s polynomial. The client can learn the vertex number of the server’s graph by counting these ciphertexts, which is the leakage function. In Step 4, the client receives another set of ciphertexts, which contains encryptions of 1s and 0s. In Step 7, the client receives the vertex union and an encrypted matrix over the vertex union, which contains encryptions of 1s and 0s.

In order to prove that the above information does not reveal anything about the server’s graph beyond what can be deduced from the vertex union and the leakage function, consider the following experiment. The adversary is a probabilistic polynomial-time adversarial client with a private graph. The adversary first chooses two graphs that have the same number of vertices and the same vertex union with the adversary's own graph. The adversary then sends the two graphs to the server. The server randomly picks a bit and chooses the corresponding graph as its private graph. After that, the server and the adversary jointly perform the private graph union protocol from Steps 1 to 7.

At the end of Step 7, the adversary needs to output a bit as its guess, using the information it received during the protocol. If the guess equals the bit picked by the server, the experiment outputs 1; otherwise, it outputs 0. The advantage of the adversary is defined as the distance between the probability that the experiment outputs 1 and 1/2.

During PGU, the information that the adversary receives consists of the two sets of ciphertexts from Steps 2 and 4, the vertex union, and the encrypted matrix from Step 7. Both sets are ciphertexts under the Paillier cryptosystem. Since the two candidate graphs have the same number of vertices, the number of ciphertexts received in Step 2 is the same for both graphs. The encrypted matrix is filled with ciphertexts. Since the Paillier cryptosystem is IND-CPA secure and the adversary cannot decrypt the ciphertexts without the private key, the two sets of ciphertexts and the encrypted matrix cannot help to distinguish which graph the server has chosen. Furthermore, since the two candidate graphs have the same vertex union with the adversary's graph, the vertex union is the same for both graphs. As a result, if the Paillier cryptosystem is IND-CPA secure, the advantage of the adversary in the above experiment is negligible.

Finally, we construct a simulator to simulate the view of the client in the ideal model. The simulator is given the vertex union and the vertex number of the server’s graph. In the ideal model, the simulator generates a set of random values in Step 2, a set of random values in Step 4, and a matrix over the vertex union filled with random values in Step 7. Since the Paillier cryptosystem is IND-CPA secure, the client cannot distinguish between the ciphertexts and random values. Therefore, the view of the client in the ideal model is computationally indistinguishable from the view in the real model. As a result, the PGU client zero-knowledge holds.

6.2. Efficiency Analysis

In this section, we analyze the efficiency of PGI and PGU in terms of communication cost and computation cost. The communication cost is measured in terms of the number of ciphertexts transferred between the server and the client, and the computation cost is measured in terms of modular exponentiations and multiplications.

We denote as the number of vertices in , as the number of vertices in , as the number of vertices in the intersection of and , and as the number of vertices in the union of and .

6.2.1. PGI Communication Cost

The construction of PGI is simple and only requires rounds of communication. In Step 2, the server sends ciphertexts to the client. In Step 3, the client sends ciphertexts to the server. In Step 5, the server sends ciphertexts to the client. At last, in Step 6, the client sends ciphertexts to the server. As a result, the total communication cost of our protocol is ciphertexts.

6.2.2. PGI Server Computation Cost

In Step 2, constructing the polynomial requires modular multiplications, and encrypting the coefficients requires modular exponentiations. In Step 4, decrypting the received ciphertexts requires modular exponentiations. In Step 5, encrypting each element in requires modular exponentiations. In Step 7, decrypting each element in requires modular exponentiations. As a result, the total computation cost for the server is modular exponentiations and modular multiplications.

6.2.3. PGI Client Computation Cost

In Step 3, obliviously evaluating the polynomial requires modular exponentiations. In Step 6, computing requires modular exponentiations. As a result, the total computation cost for the client is modular exponentiations.

6.2.4. PGU Communication Cost

The construction of PGU also requires only rounds of communication. During Steps 2 to 5, the server sends ciphertexts to the client, and the client sends ciphertexts to the server. During Steps 7 and 8, the server sends ciphertexts to the client, and the client sends ciphertexts to the server. As a result, the total communication cost is ciphertexts.

6.2.5. PGU Server Computation Cost

In Step 2, constructing the polynomial requires modular multiplication, and encrypting the coefficients requires modular exponentiations. In Step 4, decrypting ciphertexts requires modular exponentiations, and encrypting ciphertexts requires modular exponentiations. In Step 6, decrypting ciphertexts requires modular exponentiations. In Step 7, encrypting each element in requires modular exponentiations. In Step 9, decrypting each element in requires modular exponentiations. As a result, the total computation cost for the server is modular exponentiations and modular multiplications.

6.2.6. PGU Client Computation Cost

In Step 3, obliviously evaluating the polynomial requires modular exponentiations. Computing homomorphic multiplication requires modular exponentiations. In Step 5, computing homomorphic multiplication requires modular exponentiations. In Step 8, encrypting each element in requires modular exponentiations. Computing homomorphic addition and multiplication requires modular exponentiations and modular multiplications. As a result, the total computation cost for the client is modular exponentiations and modular multiplications.

6.3. Leakage Analysis

6.3.1. PGI Leakage

As stated before, the proposed PGI leaks certain information about the private graphs, which is modeled as the leakage functions and . There are several techniques that can be used to reduce the amount of information leakage; however, it cannot be completely avoided.

In Step 2, the server constructs a polynomial , such that all the roots of are exactly the elements in . After that, the server sends the encryptions of the coefficients of to the client. In order to prevent the client from learning the exact vertex number of the server’s graph, the server first randomly constructs an irreducible polynomial with degree . The server then computes and uses instead of in Step 2. The polynomial has the same property as ; therefore, it will not affect the result of the protocol. As a result, by counting the number of ciphertexts received, the client can only learn the upper bound of the vertex number of the server’s graph, i.e., .
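The padding idea can be sketched in plaintext as follows. This is an illustration under stated assumptions, not the protocol itself: it works over a prime field Z_P (the modulus P is an assumption, whereas the protocol operates on Paillier plaintexts), it omits encryption of the coefficients, and it pads with an irreducible polynomial of degree 2 of the form x^2 - a, where a is a quadratic non-residue. All function names are hypothetical. Because the padding polynomial has no roots in the field, the product vanishes exactly on the original vertex set, while the number of coefficients only reveals an upper bound on the vertex number.

```python
import random

P = 2**61 - 1  # assumed prime modulus, so that Z_P is a field


def poly_mul(a, b, p=P):
    """Multiply two polynomials given as coefficient lists (low to high)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def poly_from_roots(roots, p=P):
    """Build the monic polynomial whose roots are exactly `roots`."""
    poly = [1]
    for r in roots:
        poly = poly_mul(poly, [(-r) % p, 1], p)
    return poly


def poly_eval(poly, x, p=P):
    """Evaluate a polynomial at x with Horner's rule."""
    acc = 0
    for c in reversed(poly):
        acc = (acc * x + c) % p
    return acc


def random_irreducible_quadratic(p=P, rng=random):
    """Return x^2 - a with a a quadratic non-residue (Euler's criterion)."""
    while True:
        a = rng.randrange(2, p)
        if pow(a, (p - 1) // 2, p) == p - 1:
            return [(-a) % p, 0, 1]


if __name__ == "__main__":
    vertices = [3, 17, 42]
    f = poly_from_roots(vertices)
    padded = poly_mul(f, random_irreducible_quadratic())
    print(len(f), len(padded))  # the padded polynomial has two more coefficients
    print([poly_eval(padded, v) for v in vertices])  # zero on every vertex
```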

In order to hide the exact vertex number of the client’s graph, the client can randomly generate a set of values from the message space of the Paillier cryptosystem in Step 3. After that, the client encrypts the random values and sends the encrypted random set to the server along with . Since the message space of the Paillier cryptosystem is large enough, the probability that a random value equals an element in can be assumed to be negligible. Therefore, the random values will not affect the result of the protocol, since they are not in the vertex intersection. As a result, by counting the number of ciphertexts received in Step 3, the server can only learn the upper bound of the vertex number of the client’s graph, i.e., .
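A minimal sketch of the client-side padding is given below. The function name, the target size, and the use of a plain integer modulus in place of the actual Paillier message space are assumptions for illustration, and encryption of the dummy values is omitted.

```python
import secrets


def generate_dummy_values(real_values, padded_size, modulus):
    """Draw random dummy values so that real plus dummy values total padded_size.

    Dummies are sampled uniformly from [0, modulus). Values that collide with
    a real value or an earlier dummy are redrawn, so the returned list
    contains only fresh values.
    """
    real = set(real_values)
    if padded_size < len(real):
        raise ValueError("padded_size must be at least the number of real values")
    dummies = set()
    while len(real) + len(dummies) < padded_size:
        candidate = secrets.randbelow(modulus)
        if candidate not in real:
            dummies.add(candidate)
    return sorted(dummies)


if __name__ == "__main__":
    client_values = [11, 25, 73]
    dummies = generate_dummy_values(client_values, padded_size=10, modulus=2**1024)
    print(len(client_values) + len(dummies))  # total number of values sent
```

With a modulus of this size, a dummy coincides with a vertex of the other party only with negligible probability, which is the assumption the argument above relies on.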

6.3.2. PGU Leakage

Similar to PGI, PGU also leaks partial information about the input graphs during the process, which is modeled as and .

In Step 2, the server can utilize the same technique, as introduced above, to hide the exact vertex number of his graph, and the client can only learn the upper bound instead, i.e., .

In Step 3, in order to hide the exact vertex number of the client’s graph, the client generates encryptions of zero and sends the ciphertexts along with . An encryption of zero in Step 3 indicates that a vertex in the client’s graph also exists in the server’s graph. In later steps, extra encryptions of zero will not affect the final result, since the vertex union between the two input graphs will remain the same. As a result, the server can only learn the upper bound of the vertex number of the client’s graph, i.e., , and the upper bound of the common vertex number, i.e., .

In addition, we consider the case where the server sends a small graph to the client in Step 2. If the server’s graph is small enough, e.g., only one vertex and no edges, the union of the graphs will be almost identical to the client’s graph. To prevent the server from learning the client’s graph in this way, there are two points at which the client can choose to end the protocol.

The first point is at Step 3. If the client receives a very small polynomial, the client can choose to end the protocol, and at this point, the server has not learned anything yet. However, if the server uses the technique stated above, the polynomial that the client receives will not give the exact size of the server’s graph. In this case, the client can check if the vertex union received in Step 8 is almost the same as his vertex set . If , it means either the server has a very small graph or the vertices in both graphs are highly overlapping. At this point, the client can choose to end the protocol; however, the server has already learned the vertex set of the client.

7. Experiments

In order to evaluate the performance of the proposed PGI and PGU protocols, we implemented the protocols and performed experiments over the Enron email dataset. All the experiments were conducted on two PCs with Intel Core i7-2600 4.2 GHz CPU, 16 GB RAM, and Windows 10 operating system. (Due to the COVID-19 crisis, we cannot access the university lab at the moment, which contains the environment and equipment needed to perform the experiments on real mobile devices. As a result, the experiments in this paper are performed on PCs, and we will extend the experiments to mobile devices in later work.) The protocols are implemented in Python 3.6, and we used the phe library for the Paillier cryptosystem with a 1024-bit key length.

7.1. Dataset

The Enron email dataset is publicly available from the Stanford SNAP website (https://snap.stanford.edu/data/). The dataset contains email communications of around half a million emails. In order to convert the dataset to a graph, the senders and the receivers of the emails are represented as vertices, and if vertex sends at least one email to vertex , there exists an undirected edge between and . The resulting graph has 36,692 vertices and 183,831 edges. In addition, each vertex of the graph is represented as a unique integer.
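The conversion described above can be sketched as follows. The sketch assumes the edge-list layout commonly used by SNAP files, i.e., comment lines starting with "#" followed by one whitespace-separated pair of integer vertex identifiers per line; the function name and the sample lines are hypothetical.

```python
def load_undirected_graph(lines):
    """Parse an edge list into a vertex set and an undirected edge set.

    Each undirected edge is stored once as a (smaller, larger) tuple, so an
    email in either direction yields the same edge. Self-loops are skipped.
    """
    vertices = set()
    edges = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        u, v = (int(token) for token in line.split()[:2])
        vertices.update((u, v))
        if u != v:
            edges.add((min(u, v), max(u, v)))
    return vertices, edges


if __name__ == "__main__":
    sample = [
        "# FromNodeId\tToNodeId",
        "0\t1",
        "1\t0",
        "1\t2",
    ]
    vertices, edges = load_undirected_graph(sample)
    print(sorted(vertices), sorted(edges))
```

For a real file, the same function can be applied to an open file handle, since it only iterates over lines.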

7.2. Evaluation of PGI

When evaluating the performance of PGI, we randomly generate two subgraphs from the Enron email graph dataset and assign them to the server and the client, respectively. For each experiment, we set and to have the same value, and they increase from 1,000 to 10,000. Furthermore, the graphs of the server and the client are generated following the rule that 5% of the vertices are the same between the two graphs. Figure 2(a) shows the computation time for the server and the client.
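One way to generate such inputs is sketched below. The sketch interprets the rule as "5% of each party's vertices are shared" and takes the subgraph induced by each sampled vertex set; both choices, as well as the function names and the seed parameter, are assumptions rather than details given in the text.

```python
import random


def sample_overlapping_vertex_sets(all_vertices, size, overlap_ratio=0.05, seed=None):
    """Sample two vertex sets of equal size that share a fixed fraction."""
    rng = random.Random(seed)
    common = round(size * overlap_ratio)
    # Draw all distinct vertices needed at once, then split them up.
    pool = rng.sample(sorted(all_vertices), 2 * size - common)
    shared = pool[:common]
    server = set(shared) | set(pool[common:size])
    client = set(shared) | set(pool[size:])
    return server, client


def induced_edges(edges, vertices):
    """Keep only the edges whose endpoints both lie in the vertex set."""
    return {(u, v) for (u, v) in edges if u in vertices and v in vertices}


if __name__ == "__main__":
    server_vs, client_vs = sample_overlapping_vertex_sets(range(36692), 1000, seed=1)
    print(len(server_vs), len(client_vs), len(server_vs & client_vs))
```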

As analyzed before, the computation costs for the server and the client are and , respectively, where is the number of vertices in the intersection of and . Therefore, the most dominant part of the computation costs for both the server and the client is most likely to be the number of common vertices between and . As shown in Figure 2(a), the computation time for both the server and the client grows quadratically as the number of common vertices increases. The detailed computation time for each step is shown in Table 2.

As shown in Table 2, the most time-consuming parts of PGI are Steps 5 and 6. In Step 5, the server performs Paillier encryptions, and in Step 6, the client performs homomorphic multiplications. Since the computations for both Steps 5 and 6 are highly parallelizable, the computation time can be greatly reduced if cluster computing is deployed.

The communication costs of PGI for both the server and the client are shown in Figure 2(b). As analyzed before, the total communication cost is . As a result, the communication costs have a quadratic growth in the figure. In addition, the communication costs are nearly the same for both the server and the client, and the overall communication cost for PGI is practical for the experimental dataset.

7.3. Evaluation of PGU

When evaluating the performance of PGU, we first randomly generate a subgraph from the Enron email graph dataset as the graph union . Then, we randomly choose two subgraphs of and assign them to the server and the client, respectively. The numbers of the vertices in the subgraphs are 60% of the vertex number in ; therefore, both and will have the same value. For each experiment, the number of vertices in increases from 50 to 500. Figure 3(a) shows the computation time for the server and the client, and the detailed computation time for each step is shown in Table 3.

As analyzed before, the computation costs for PGU are and for the server and the client, respectively, where is the number of vertices in the union of and . Therefore, similar to PGI, the most dominant part of the computation costs for both the server and the client is most likely to be the number of vertices in .

As shown in Table 3, most of the computation time is spent in Steps 7 and 8. In Step 7, the server performs Paillier encryptions, and in Step 8, the client performs Paillier encryptions and homomorphic additions and multiplications. As before, these computations are highly parallelizable, and cluster computing can greatly reduce the computation time.

As shown in Figure 3(b), the communication cost of PGU is similar to PGI, and the communication costs for both the server and the client have a quadratic growth as the number of vertices in increases. For our experimental dataset, the overall communication cost for the PGU protocol is also practical.

8. Conclusion

In this work, we proposed two privacy-preserving graph operation protocols between two parties, which can be used for secure authentication for smart devices. The first protocol, PGI, allows a server and a client to jointly compute the intersection of their private graphs, while the second protocol, PGU, computes the union of the graphs. The protocols first use polynomial representation and oblivious polynomial evaluation to compute the intersection and union of the vertices. The intersection and union of the edges are then computed by using an additively homomorphic cryptosystem.

We proved that the proposed protocols are secure in the semihonest security model. In other words, a semihonest client learns nothing about the server’s graph and a semihonest server learns nothing about the client’s graph beyond the modeled leakage. We analyzed the leakages during the protocols for both the server and the client and modeled the leakages as leakage functions. Finally, we implemented the protocols and evaluated their efficiency over real-world graph data.

Data Availability

The graph data used to support the findings of this study can be found at https://snap.stanford.edu/data/.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61872069) and the Fundamental Research Funds for the Central Universities (N2017012).