Computational fuzzy extractors
Introduction
Authentication requires a secret drawn from some high-entropy source. One of the primary building blocks for authentication is reliable key derivation. Unfortunately, many sources that contain sufficient entropy to derive a key are noisy and provide similar but not identical secret values at each reading. Examples of such sources include biometrics [2], measurements of capacitance [3], timing [4], motion [5], and quantum information [6].
Fuzzy extractors [7] derive reliable keys from noisy sources (see [8], [9], [10], [11] for applications of fuzzy extractors). The primitive consists of two algorithms: Generate (used once) and Reproduce (used subsequently). The Generate (Gen) algorithm takes an input w and produces a key r and a public value p. The Reproduce (Rep) algorithm is able to reproduce r given p and some value w′ that is close to w (according to some predefined metric, such as Hamming distance). Crucially for security, knowledge of p should not reveal r; that is, r should be uniformly distributed conditioned on p. This feature is needed because p is not secret: for example, in a single-user setting (where the user wants to reproduce the key r from a subsequent reading w′), it would be stored in the clear; and in a key agreement application [8] (where two parties have w and w′, respectively), the natural solution is to send p between the parties. (More techniques are possible when interactive communication is permitted; see Dupont et al. for a recent example [12].)
Fuzzy extractors use ideas from information-reconciliation and privacy amplification [6] and are defined (traditionally) as information-theoretic objects. Privacy amplification is usually performed with a randomness extractor [13]. Randomness extractors are well-understood [14]. Polynomial-time constructions of randomness extractors can extract randomness from all distributions with min-entropy with the help of a short uniform nonsecret seed. A single randomness extractor simultaneously works for all probability distributions with sufficient entropy. Furthermore, for randomness extractors, the parameter gap between negative results, nonconstructive positive results, and polynomial-time constructions is relatively small.
Unfortunately, the state of fuzzy extractors is murkier. There is no crisp characterization of when key derivation is possible. Fuller, Reyzin, and Smith [15], [16] present one possible notion called fuzzy min-entropy. They show a non-polynomial-time algorithm that derives a key from each distribution with fuzzy min-entropy. Woodage et al. [17] subsequently improved the parameters. As a negative result, Fuller, Reyzin, and Smith [15], [16] and Fuller and Peng [18] show families of distributions where no fuzzy extractor can simultaneously work for the whole family, despite the fact that a fuzzy extractor exists for each element of the family. Thus, two main open areas of research for information-theoretic fuzzy extractors are providing polynomial-time constructions and providing constructions that simultaneously secure many distributions. This work asks:
Can computational security close these gaps?
We consider the sketch-then-extract paradigm used in most fuzzy extractor constructions. This paradigm combines a secure sketch and a randomness extractor. A secure sketch is a one-round information-reconciliation protocol. It allows recovery of the original value w from any nearby value w′. A randomness extractor is then run on w to produce uniform bits. One could replace the usual information-theoretic randomness extractor with a computational one [19], [20], [21] (constructed, for example, by applying a pseudorandom generator to the output of an information-theoretic extractor), but a computational extractor helps only if the min-entropy of w conditioned on the sketch is high enough (otherwise, the computational extractor has no security). Since the security losses due to secure sketches are usually much higher than those due to randomness extraction, the secure sketch becomes the bottleneck.
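The sketch-then-extract paradigm can be illustrated with a toy example for the Hamming metric. The sketch below is only an illustration under stated assumptions: it uses a 3-fold repetition code for a code-offset secure sketch, and it uses salted SHA-256 as a heuristic stand-in for a seeded randomness extractor (it is not an information-theoretic extractor). All function names are hypothetical.

```python
import hashlib
import secrets

REP_FACTOR = 3  # toy repetition code: each message bit is repeated 3 times


def encode(msg_bits):
    """Encode with the repetition code."""
    return [b for b in msg_bits for _ in range(REP_FACTOR)]


def decode(bits):
    """Majority-vote decoding, one block of REP_FACTOR bits at a time."""
    return [int(2 * sum(bits[i:i + REP_FACTOR]) > REP_FACTOR)
            for i in range(0, len(bits), REP_FACTOR)]


def sketch(w):
    """Code-offset sketch: publish w XOR c for a random codeword c."""
    msg = [secrets.randbelow(2) for _ in range(len(w) // REP_FACTOR)]
    return [wi ^ ci for wi, ci in zip(w, encode(msg))]


def recover(w_prime, s):
    """Recover w from a nearby w_prime and the sketch s."""
    noisy_codeword = [a ^ b for a, b in zip(w_prime, s)]  # equals c XOR error
    c = encode(decode(noisy_codeword))
    return [si ^ ci for si, ci in zip(s, c)]


def extract(w, seed):
    """Assumption: salted SHA-256 stands in for a seeded extractor."""
    return hashlib.sha256(seed + bytes(w)).digest()


def gen(w):
    """Generate: returns the key r and the public value p = (sketch, seed)."""
    s = sketch(w)
    seed = secrets.token_bytes(16)
    return extract(w, seed), (s, seed)


def rep(w_prime, p):
    """Reproduce: recover w from w_prime, then re-extract the key."""
    s, seed = p
    return extract(recover(w_prime, s), seed)
```

With this toy code, Rep returns the same key as long as w′ differs from w in at most one position per 3-bit block. The sketch itself leaks information about w, which is the loss the surrounding text identifies as the bottleneck.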
We ask if a computational secure sketch can overcome information-theoretic lower bounds. The most natural relaxation of the min-entropy requirement of the secure sketch is to require HILL entropy [22] (namely, that the distribution of w conditioned on the sketch be indistinguishable from a high-min-entropy distribution). Under this definition, one could use a randomness extractor to obtain r from w, resulting in a pseudorandom key.
Negative result We prove in Theorem 3.6 that the entropy loss of such computational HILL secure sketches is subject to coding bounds that are similar to the ones that constrain information-theoretic secure sketches. More precisely, for every secure sketch that retains m bits of computational entropy, there is an error-correcting code with codewords. This error-correcting code can then be used to instantiate an information-theoretic secure sketch.
The idea is that, by definition of HILL entropy, an adversary should not be able to distinguish a pair (w, p) from a pair (x, p), where x is drawn from a distribution with actual entropy conditioned on p. For most points w′ close to w, the sketch's recovery procedure, run on w′ and p, outputs w. Thus, the same must be true for points x drawn conditioned on a given p (or else we could build a distinguishing adversary), forcing the distribution of x conditioned on p to function as an error-correcting code.
Alternative computational definitions for secure sketches We define computational secure sketches via HILL entropy. A natural question is whether a weaker definition of security for secure sketches could avoid the negative result. A minimum condition is computational unpredictability of w given p [23]. If such a definition is used, one can instantiate sketch-and-extract with a reconstructive extractor [24], [23] (one way to build such an extractor is via repeated, independent applications of the Goldreich-Levin hardcore function [25]). Constructing secure sketches with computational unpredictability of w given p, or proving negative results about them, is a fascinating open problem.
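The Goldreich-Levin approach mentioned above can be sketched in a few lines. This is a minimal illustration, assuming w is a bit vector and that each output bit uses a fresh, public, uniformly random vector; the function names are hypothetical, and the sketch makes no claim about the parameters needed for security.

```python
import secrets


def gl_bit(w, a):
    """Goldreich-Levin predicate: inner product of w and a over GF(2)."""
    return sum(wi & ai for wi, ai in zip(w, a)) % 2


def gl_extract(w, num_bits):
    """Derive num_bits key bits, one per independent public vector."""
    seeds = [[secrets.randbelow(2) for _ in w] for _ in range(num_bits)]
    key = [gl_bit(w, a) for a in seeds]
    return key, seeds


def gl_reextract(w, seeds):
    """Recompute the key from w and the stored public vectors."""
    return [gl_bit(w, a) for a in seeds]
```

The public vectors play the role of the extractor seed: they can be stored in the clear, and the key is recomputed from them once w has been recovered.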
Let us briefly discuss two other alternative definitions of pseudoentropy, called inaccessible entropy [26], [27] and next-block pseudorandomness [28], [29]. Inaccessible entropy measures the difference between the entropy of w conditioned on p and the ability of an adversary to find other values that are consistent with p. Since inaccessible entropy is bounded above by actual entropy, it is not clear how to adapt this tool.
Next-block pseudorandomness asks that the distribution of each symbol of w is indistinguishable, conditioned on , from some distribution such that the sum of the conditional entropies of is high enough. Next-block pseudorandomness is used in building pseudorandom generators from one-way functions. It may be possible to build a good fuzzy extractor from this definition by modifying the subsequent extraction procedure, perhaps using techniques from [28], [29]. However, it may be that secure sketches based on this indistinguishability-style definition are subject to coding-theory bounds similar to those for secure sketches based on HILL entropy, in which case this definition will not lead to improved constructions. Resolving these questions is another fascinating open problem.
For now, to avoid our negative result, we focus on directly constructing a computational fuzzy extractor. That is, in our construction, we will show that the output key r is indistinguishable from uniform (conditioned on p). To avoid the negative result for secure sketches, the pair must be one-way in the value w.
Positive result We construct the first fuzzy extractor whose security relies on computational security arguments (Juels and Sudan suggested using computational security in [30]). The construction can derive a key r whose length is at least the entropy of the source w. Our construction is for the Hamming metric and uses the code-offset construction [31], [7, Section 5] used in prior work, but with two crucial differences. First, the key r is not extracted from w as in the sketch-and-extract approach; rather, w “encrypts” r in a way that is decryptable with the knowledge of some close w′ (this idea is similar to the way the code-offset construction is presented in [31] as a “fuzzy commitment”). Our construction uses private randomness within Gen, which is allowed in the fuzzy extractor setting but not for noiseless randomness extraction. Second, the code used is a random linear code, which allows us to use the Learning with Errors (LWE) assumption due to Regev [32], [33], [34] and derive a longer key r.
For security, we rely on the result of Döttling and Müller-Quade [35], which shows the hardness of decoding random linear codes when the error vector comes from the uniform distribution, with each coordinate ranging over a small interval. This allows us to use w as the error vector, assuming it is uniform. There have been subsequent works on uniform-error LWE [36], [37]; however, as we discuss in Section 4.2, these changes do not substantively affect our parameters. We also use a result of Akavia, Goldwasser, and Vaikuntanathan [38], which says that LWE has many hardcore bits, to hide r.
Because we use a random linear code, our decoding is limited to guessing a subset of locations and checking whether it contains errors. Unfortunately, we cannot utilize the results that improve the decoding radius through the use of trapdoors (such as [32], [34]), because in a fuzzy extractor, there is no secret storage place for the trapdoor (in particular, Gen cannot pass a secret to Rep). If improved decoding algorithms are obtained for random linear codes, they will improve the error-tolerance of our construction. However, the problem of generally decoding random linear codes is NP-hard [39].
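The guess-and-check decoding described above can be sketched with toy parameters. This is a hedged illustration, not the construction as specified in the paper: the modulus Q, dimensions N and M, error tolerance T, and the choice to output the first half of the secret vector as the key are all assumptions made for the example, the decoder is a simplified variant that samples N rows at a time, and the parameters are far too small to offer any security.

```python
import random
import secrets

Q = 97          # assumed toy prime modulus
N = 4           # assumed dimension of the secret vector x
M = 24          # assumed length of the source reading w
T = 2           # assumed number of tolerated errors (Hamming distance)
INTERVAL = 9    # assumed range of each coordinate of w


def solve_mod(rows, rhs, q):
    """Solve a square linear system over F_q; return None if it is singular."""
    n = len(rows)
    mat = [list(r) + [c] for r, c in zip(rows, rhs)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if mat[i][col] % q), None)
        if pivot is None:
            return None
        mat[col], mat[pivot] = mat[pivot], mat[col]
        inv = pow(mat[col][col], -1, q)  # modular inverse, Python 3.8+
        mat[col] = [(v * inv) % q for v in mat[col]]
        for i in range(n):
            if i != col and mat[i][col]:
                f = mat[i][col]
                mat[i] = [(a - f * b) % q for a, b in zip(mat[i], mat[col])]
    return [mat[i][n] for i in range(n)]


def gen(w):
    """Gen: sample a random code A and secret x; publish p = (A, Ax + w)."""
    A = [[secrets.randbelow(Q) for _ in range(N)] for _ in range(len(w))]
    x = [secrets.randbelow(Q) for _ in range(N)]
    b = [(sum(a * xi for a, xi in zip(row, x)) + wi) % Q
         for row, wi in zip(A, w)]
    return x[:N // 2], (A, b)  # assumption: the key r is the first half of x


def rep(w_prime, p, max_tries=1000):
    """Rep: guess error-free rows, solve for x, and check the residual weight."""
    A, b = p
    c = [(bi - wi) % Q for bi, wi in zip(b, w_prime)]  # Ax + (w - w_prime)
    for _ in range(max_tries):
        idx = random.sample(range(len(c)), N)  # guessed error-free locations
        x = solve_mod([A[i] for i in idx], [c[i] for i in idx], Q)
        if x is None:
            continue
        weight = sum(1 for row, ci in zip(A, c)
                     if (sum(a * xi for a, xi in zip(row, x)) - ci) % Q)
        if weight <= T:  # the guess avoided every error position
            return x[:N // 2]
    return None
```

Each guess succeeds only if none of the sampled rows falls on a position where w and w′ differ, which is why the error tolerance of this style of decoding is limited without a trapdoor.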
The construction is secure whenever w is drawn from an error distribution that makes the decisional version of the LWE problem hard. Toward this end, we show the hardness of LWE when some dimensions of the error vector are fixed (and adversarially known), which may be of independent interest (Theorem 5.2). This allows w to come from a symbol-fixing source [40] (each dimension is either uniform or fixed).
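A symbol-fixing source is easy to illustrate with a sampler. The sketch below is a toy example with hypothetical names: `fixed` maps each fixed position to its (adversarially known) symbol, and every other position is sampled uniformly from an alphabet of the given size.

```python
import math
import secrets


def sample_symbol_fixing(m, alphabet_size, fixed):
    """Sample w of length m: positions in `fixed` are constant, others uniform."""
    return [fixed[i] if i in fixed else secrets.randbelow(alphabet_size)
            for i in range(m)]


def min_entropy_bits(m, alphabet_size, fixed):
    """Min-entropy in bits: only the uniform positions contribute."""
    return (m - len(fixed)) * math.log2(alphabet_size)
```

Only the unfixed positions contribute entropy, so a source of length 8 over a 4-symbol alphabet with 2 fixed positions has 12 bits of min-entropy.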
Subsequent to the introduction of computational fuzzy extractors in the conference version of this work [1], other works built computational fuzzy extractors for noisy sources for which no efficient information-theoretic construction is known (e.g., [41]). Under strong cryptographic assumptions (semantically secure graded encodings), a polynomial-time computational fuzzy extractor exists for every source where the distance metric is computable in the complexity class [42].
A desirable property for fuzzy extractors is reusability [43], which guarantees that a user can securely enroll the value w with multiple independent providers to get values . Even with noise between different enrollments, each key should be private conditioned on the rest of the values. Boyen showed strong negative results on information-theoretically secure reusable fuzzy extractors [43].
Apon et al. [44] showed that the construction presented in this paper achieves a weak form of reusability if it is modified so that the random code is a global parameter (instead of being created as part of Gen). They also show how to augment the reusability using either a random oracle or LWE-based symmetric encryption techniques. Other subsequent work used different cryptographic techniques to construct reusable computational fuzzy extractors [41], [45], [46], [47].
Our security arguments are based on the learning-with-errors assumption with . Herder et al. [48] present a similar construction when that reduces to a form of learning parity with noise [49]. Herder et al.'s construction is secure when the bits of w are independent Bernoulli trials. They also show security when w comes from a class of affine transformations [48, Section 7]. Lastly, Huth et al. [50], [51], [52] implemented our construction on multiple devices, including a constrained 8-bit microcontroller.
The same authors published a conference version of this work in Asiacrypt 2013 [1]. That work did not include proofs or a detailed discussion of parameters. The theorem statement and the underlying proof in Section 3 had a minor error (pointed out by Yasunaga and Yuzawa [53]). This version corrects the theorem statement and proof. There was also a second negative result for secure sketches that is superseded by a more recent result in [15]; this is discussed in Section 3. Additionally, the conference version focused on extracted key length for high-entropy inputs as the sole reason to move to computational security. Since the conference version, it became evident that there are other important reasons. In particular, there is a large gap between known negative results for information-theoretic fuzzy extractors and positive constructions. There are many sources of practical importance, such as the iris [54] and physical unclonable functions [55], for which the best known information-theoretic fuzzy extractors provide little or no security. Since the publication of the conference version of this paper, computational constructions [55], [54] have been able to provide meaningful, albeit modest, security for such sources, while adding additional properties such as reusability. Lastly, this version discusses more recent results on uniform-error LWE and their applicability to our setting (in Section 4.2).
Preliminaries
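For a random variable where each is over some alphabet , we denote by . The min-entropy of X is $\mathrm{H}_\infty(X) = -\log(\max_x \Pr[X = x])$, and the average (conditional) min-entropy [7, Section 2.4] of X given Y is $\tilde{\mathrm{H}}_\infty(X \mid Y) = -\log(\mathbb{E}_{y \leftarrow Y} \max_x \Pr[X = x \mid Y = y])$. The statistical distance between random variables X and Y with the same domain is $\Delta(X, Y) = \frac{1}{2} \sum_x |\Pr[X = x] - \Pr[Y = x]|$. For a distinguisher D (or a class of distinguishers ) we write the computational distance between X and Y as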
Impossibility of computational secure sketches
In this section, we show that a sketch that retains HILL entropy implies a large error-correcting code. For inputs that have full entropy this immediately implies a sketch that retains nearly the same amount of min-entropy. HILL entropy is a commonly used computational notion of entropy [22]. It was extended to the conditional case by Hsiao, Lu, and Reyzin [23]. Here we recall a weaker definition due to Gentry and Wichs [57] (the term relaxed HILL entropy was introduced in [58]); since we show
A computational fuzzy extractor based on LWE
As stated in the introduction, our construction of a computational fuzzy extractor treats the input w (drawn from the source W) as the noise term added to a codeword of a random linear code. Thus, the security of our construction depends on the distribution given by W. In this section we consider a uniform source W; we consider other distributions in Section 5. Our construction uses the code-offset construction [31], [7, Section 5] instantiated with a random linear code over a finite field .
Extending to nonuniform sources
We note that Construction 4.1 is secure whenever the source W is an LWE admissible distribution, meaning that using W as the error vector in LWE makes decoding/distinguishing computationally hard. (The instance has to be sufficiently hard for there to be a large number of hardcore bits.) Towards this end, we show hardness of LWE when a small number of dimensions of the error vector are fixed. We recall the notion of a symbol fixing source (from [40, Definition 2.3]): Definition 5.1 Let be a
Declaration of Competing Interest
In addition to affiliations with UConn, Amazon and Boston University, the authors have previously been affiliated with MIT Lincoln Laboratory, Apple, MIT, IST Austria, and Algorand. Other than potential organizational conflicts of interest, the authors foresee no potential conflicts.
Acknowledgements
The authors are grateful to Jacob Alperin-Sheriff, Ran Canetti, Yevgeniy Dodis, Nico Döttling, Daniele Micciancio, Jörn Müller-Quade, Chris Peikert, Oded Regev, Adam Smith, and Daniel Wichs for helpful discussions, creative ideas, and important references. In particular, the authors thank Nico Döttling for describing his result on LWE with uniform errors.
This work is supported in part by National Science Foundation grants 0831281, 1012910, 1012798, and 1849904. The work of Benjamin Fuller is
References (74)
- et al. Computational fuzzy extractors
- How iris recognition works. IEEE Trans. Circuits Syst. Video Technol. (January 2004)
- et al. Read-proof hardware from protective coatings
- et al. Physical unclonable functions for device authentication and secret key generation
- et al. Shake them up!: a movement-based pairing protocol for CPU-constrained devices
- et al. Privacy amplification by public discussion. SIAM J. Comput. (1988)
- et al. Fuzzy extractors: how to generate strong keys from biometrics and other noisy data. SIAM J. Comput. (2008)
- et al. Secure remote authentication using biometric data
- et al. Non-malleable extractors and symmetric key cryptography from weak secrets
- et al. Privacy amplification with asymptotically optimal entropy loss