
* Protocols with Trusted Dealer (week 2 & 3)
** Curriculum
- Model (trusted dealer).
- Protocols with passive security (One-time truth-tables, passive BeDOZa) and their security.
- Active attacks against OTTT/BeDOZa.
- Countermeasures (using information-theoretic MACs) and their security.
** Prelims
- The whole passive security setup, where we define views and require that a simulator's output is computationally indistinguishable from the view of each corrupted party.
- The views contain the party input, internal randomness (if any), and all the messages received during the protocol
+ The messages sent do not need to be included, as these are a function of the randomness and the messages received
** Passively secure protocols
- A trusted dealer is essentially just a third-party who we assume we can trust
*** OTTT
- This protocol allows two parties to compute any function of their inputs, $f : \{0,1\}^n \times \{0,1\}^n \rightarrow \{0,1\}$.
- Rather than computing the function directly, it is represented by a truth table (a matrix) $T \in \{0,1\}^{2^n \times 2^n}$ where $T[i,j] = f(i,j)$.
- The ideal functionality is: Alice inputs $x \in \{0,1\}^{n}$ and Bob inputs $y \in \{0,1\}^{n}$. The protocols allows Alice to learn $z = f(x,y)$ and Bob should learn /nothing/.
**** The protocol
- *The trusted dealer* /D/ does the following:
1) Choose two shifts $r \in \{0,1\}^n$ and $s \in \{0,1\}^n$ *uniformly at random*
2) Choose matrix $M_B \in \{0,1\}^{2^n \times 2^n}$ *uniformly at random*
3) Compute a matrix $M_A$ s.t. $M_A[i,j] = M_B[i,j] \oplus T[i-r \mod 2^n, j-s \mod 2^n]$
4) Output $(r,M_A)$ to Alice and $(s, M_B)$ to Bob.
- *The actual protocol*:
1) Alice computes $u = x+r \mod 2^n$ and sends $u$ to Bob
2) Bob computes $v = y+s \mod 2^n$ and $z_B = M_B[u,v]$ and sends $(v,z_B)$ to Alice
3) Alice outputs $z = M_A[u,v] \oplus z_B$
- *Correctness*: $$z = M_A[u,v] \oplus z_B = M_A[u,v] \oplus M_B[u,v] = T[u-r, v-s] = T[x,y] = f(x,y)$$
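A minimal Python sketch of the dealer and the protocol above (the example function, the parameter $n$ and all names are illustrative, not part of the notes):

#+begin_src python
# Toy sketch of OTTT for n-bit inputs; f is an arbitrary example function.
import random

n = 3
N = 2 ** n
f = lambda i, j: (i * j) % 2          # example function f : [N] x [N] -> {0,1}

# Trusted dealer
T = [[f(i, j) for j in range(N)] for i in range(N)]
r, s = random.randrange(N), random.randrange(N)
M_B = [[random.randint(0, 1) for _ in range(N)] for _ in range(N)]
M_A = [[M_B[i][j] ^ T[(i - r) % N][(j - s) % N] for j in range(N)]
       for i in range(N)]
# Dealer outputs (r, M_A) to Alice and (s, M_B) to Bob.

# The protocol
x, y = 5, 3                            # Alice's and Bob's inputs
u = (x + r) % N                        # Alice -> Bob
v = (y + s) % N                        # Bob -> Alice, together with z_B
z_B = M_B[u][v]
z = M_A[u][v] ^ z_B                    # Alice's output

assert z == f(x, y)
#+end_src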
**** Security proof TODO: Figure out the simulator for Bob
- We need to construct a simulator that, given the input and output, is computationally indistinguishable from the views.
- The views for OTTT are defined as: $$ view_A = \{x, (r,M_A), (v, z_B) \} $$ and $$ view_B = \{y, (s, M_B), u \}$$
- According to the book, as this is a deterministic function, we can look at the distribution of outputs and the views separately.
- *Alice*:
1) The simulator gets as input $x \in \{0,1\}^n$ and $z \in \{0,1\}$
2) Sample uniformly a random $z_B \in \{0,1\}$, a random $v \in \{0,1\}^n$ and a random $r \in \{0,1\}^n$
3) Construct $M_A$: for all $(i,j) \neq (x+r,v)$, choose $M_A[i,j] \in \{0,1\}$ uniformly at random. Define $M_A[x+r,v] = z \oplus z_B$.
- In both the simulated and real case, the values $r, v$ and $M_A[i,j]$ for all $(i,j) \neq (u,v)$ are chosen uniformly at random.
- The pair $(M_A[u,v], z_B)$ is in both cases a pair of random bits s.t. $z = M_A[u,v] \oplus z_B$
+ This is an application of the "principle of deferred decision", i.e. it does not matter which order the random elements are sampled
- Has unconditional security, kind of the OTP of MPC
- Optimal round complexity, as it's only one message per party
- $2n+1$ bits are sent, which is $1$ bit off the optimal communication complexity of $2n$
- Has a bad storage problem, as each party needs to receive a number of bits exponential in $n$ from the trusted dealer
+ One could generate $M_B$ or $M_A$ using a pseudorandom generator. This reduces the protocol from unconditional to computational security, but the storage complexity of one of the parties can be made small, while the other still needs exponential storage.
*** BeDOZa
- Circuit based, still using a trusted dealer
- More complicated, as it has to support different operations, XOR and AND.
- *Circuit Notation*: A circuit is a function $C : \{0,1\}^n \times \{0,1\}^n \rightarrow \{0, 1\}$ where the first $n$ input bits come from Alice and the last $n$ from Bob. Gates have unbounded fanout. No cycles are allowed.
- *Invariant*: The protocol works on secret shared bits.
- There are $6$ different kinds of wires/gates:
1) Input Wires: For each of the /n/ wires belonging to Alice: Alice samples a random bit $x_B \leftarrow \{0,1\}$, sets $x_A = x \oplus x_B$ and then sends $x_B$ to Bob. $x$ is said to be "shared" or "x-in-a-box", using notation $[x] \leftarrow Share(A,x)$. Bob is symmetric to this.
2) Output wires: If Alice (resp. Bob) is supposed to learn $[x]$, Bob sends $x_B$ to Alice who can then output $x = x_A \oplus x_B$; $(x, \perp) \leftarrow OpenTo(A, [x])$. We write $x \leftarrow Open([x])$ if both are to learn $x$.
3) XOR with Constant: Write $[z] = [x] \oplus c$ for a unary gate computing $z = x \oplus c$ for some constant bit $c \in \{0,1\}$. In reality, Alice computes $z_A = x_A \oplus c$ while Bob simply sets $z_B = x_B$.
4) AND with Constant: $[z] = [x] \cdot c$. Same as above but with multiplication: both Alice and Bob multiply their share by $c$, i.e. $z_{A,B} = x_{A,B} \cdot c$.
5) XOR of Two Wires: $[z] = [x] \oplus [y]$. Alice computes $z_A = x_A \oplus y_A$, Bob computes $z_B = x_B \oplus y_B$.
6) AND of Two Wires: The only part which requires the trusted dealer. $[z] \leftarrow EvalAnd([x], [y])$ computes $z = x \cdot y$ (a sketch of one invocation follows after this list):
1) Dealer outputs a random triple $[u], [v], [w]$ s.t. $w = u \cdot v$.
2) Run subprotocol: $[d] = [x] \oplus [u]$
3) Run subproto: $[e] = [y] \oplus [v]$
4) Run subproto: $d \leftarrow Open([d])$
5) Run subproto: $e \leftarrow Open([e])$
6) Run subproto: $[z] = [w] \oplus e \cdot [x] \oplus d \cdot [y] \oplus e \cdot d$
- *Putting it all together*: The circuit has $L$ wires $x^1, ..., x^L$ and a single output wire $x^L$. First Alice and Bob run the subprotocol Share for each of the $2n$ input wires in the circuit; then for each layer $1,\dots,d$ of the circuit, Alice and Bob securely evaluate all gates at that layer using the XOR and AND subprotocols (gates can only take input from gates at lower layers). Eventually the value on the output wire is ready and can be opened to Alice, $(x, \perp) \leftarrow OpenTo(A, [x^L])$.
- OT can be used to remove the trusted dealer from BeDOZa.
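A Python sketch of a single EvalAnd invocation, as referenced in item 6 above (the dealer and the opening are idealised; this is only meant to illustrate the algebra):

#+begin_src python
# Toy sketch of EvalAnd: secure AND of two secret-shared bits [x], [y]
# using a multiplication triple [u], [v], [w] with w = u AND v.
import random

rbit = lambda: random.randint(0, 1)

def share(bit):                       # split a bit into two XOR shares
    b = rbit()
    return bit ^ b, b                 # (Alice's share, Bob's share)

x, y = 1, 1
xA, xB = share(x)
yA, yB = share(y)

# Dealer: random triple with w = u AND v, secret-shared
u, v = rbit(), rbit()
w = u & v
uA, uB = share(u)
vA, vB = share(v)
wA, wB = share(w)

# [d] = [x] xor [u], [e] = [y] xor [v], then Open d and e
d = (xA ^ uA) ^ (xB ^ uB)
e = (yA ^ vA) ^ (yB ^ vB)

# [z] = [w] xor e*[x] xor d*[y] xor e*d (the public constant e*d is added by one party only)
zA = wA ^ (e & xA) ^ (d & yA) ^ (e & d)
zB = wB ^ (e & xB) ^ (d & yB)

assert (zA ^ zB) == (x & y)
#+end_src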
**** Analysis TODO!
- We consider only semi-honest (or passive) at this point and this function is deterministic, so it's enough to prove that the output is correct and the view of a corrupted party can be simulated.
- *Correctness*: All gates are trivially correct, apart from AND: $$w \oplus e \cdot x \oplus d \cdot y \oplus e \cdot d = uv \oplus (xy \oplus vx) \oplus (xy \oplus uy) \oplus (xy \oplus vx \oplus uy \oplus uv) = xy$$
- *Simulation of the view of a corrupted Alice, having only access to her input/output*:
1) For each invocation of $[x^i] = Share(x^i, A)$, the simulator (like an honest Alice would), samples random $x^i_B$ and sets $x^i_A = x^i \oplus x^i_B$.
2) For each invocation of $[x^i] = Share(x^i, B)$, the simulator includes in the view a message from Bob with a random bit $x^i_A \leftarrow \{0,1\}$.
3) When $[x^k] = [x^i] \oplus [x^j]$ is invoked, the sim (like an honest Alice) computes $x^k_A = x^i_A \oplus x^j_A$ (simulation of XOR with constant and AND with constant is done similarly)
4) When $[x^k] \leftarrow EvalAnd([x^i], [x^j])$ is invoked, the sim includes in the view uniformly random bits for Alice's shares $u_A, v_A, w_A$ of the dealer's triple and uniformly random bits for Bob's shares of $d$ and $e$ sent during Open; these values are uniformly distributed in the real protocol too, since they are masked by the random triple.
** Active Security
- Message authentication codes!!
*** OTTT with MACs
- A malicious Bob can deviate from the original OTTT protocol in a few ways:
1) Bob can send the wrong value $v'$, rather than $v = y+s$. This means that Bob sends some arbitrary $v' \in \{0,1\}^n$. However, this can be seen as input substitution, since it's based on $y$, which is a value only Bob knows regardless.
+ Since $v' = (y + diff) + s$ for some $diff$ with $y + diff = y'$, where $y'$ might as well have been Bob's original input.
2) Bob sends /nothing/ or /an invalid message/. This happens if Bob either does not send anything or sends a pair which is not in the right format, i.e. $(v', z'_B) \not\in \{0,1\}^n \times \{0,1\}$.
+ The second condition can be checked by Alice, and the first can be handled by adding a timeout after which Alice gives up.
+ At this point Bob has learned nothing but the value $u$, which is just a random value, so we will not consider this cheating.
+ We account for it by modifying the protocol so that if Alice detects an invalid message or receives none, she simply outputs $z = f(x, y_0)$ where $y_0$ is some default input. This can be handled in the simulation by having the simulator give $y_0$ to the ideal functionality.
3) Bob /sends a wrong value/ $z'_B$. Since $z_B$ is the value we XOR with in the end, if it's flipped, Alice will get the wrong result, but will not know this.
+ Since $z'_B = z_B \oplus 1$, Alice will output $z' = z \oplus 1 = f(x,y) \oplus 1$.
+ This is *NOT* /input substitution/: if $f(x,y) = 0$ for all values of $x$ and $y$, Bob flipping his $z_B$ would still make Alice output $1$ instead of $0$, something no substituted input $y'$ could cause.
- Thus, we need to defend against the third case!
**** Intro to MACs
- Message authentication code
- Has three algos $(Gen, Tag, Ver)$, where $Gen$ produces a MAC key $k$ which can then be used to compute a tag on messages: $t = Tag(k,m)$. The verification function $Ver(k,t,m)$ outputs accept if $t$ is a valid tag for message $m$ under key $k$ and reject otherwise.
+ Security of a MAC is defined as a game between a challenger C and an adversary A. C samples a key $k$, and A is allowed to query for tags $t_1,\dots,t_q$ on messages $x_1,\dots,x_q$ of his choice. The adversary then outputs a pair $(t', x')$ for a message $x'$ which he has not already queried. A MAC scheme is $(q, \epsilon)$-secure if every adversary making at most $q$ queries has probability at most $\epsilon$ of producing a valid tag $t'$ for $x'$.
**** Enhancing OTTT
- *The dealer* does the following:
1) Choose two shifts $r \in \{0,1\}^n$ and $s \in \{0,1\}^n$ *uniformly at random*
2) Choose matrix $M_B \in \{0,1\}^{2^n \times 2^n}$ *uniformly at random*
3) Compute a matrix $M_A$ s.t. $M_A[i,j] = M_B[i,j] \oplus T[i-r \mod 2^n, j-s \mod 2^n]$
4) Generate $2^{2n}$ keys for a $(1, \epsilon)$-secure MAC, i.e. $\forall i,j \in \{0,1\}^n$ define $K[i,j] \leftarrow Gen()$.
5) Generate MACs for the values in $M_B$, i.e. $\forall i,j \in \{0,1\}^{n}$ define $T[i,j] \leftarrow Tag(K[i,j], M_B[i,j])$.
6) Output $(r, M_A, K)$ to Alice and $(s, M_B, T)$ to Bob.
- *The protocol*:
1) Alice computes $u = x+r \mod 2^n$ again and sends this to Bob
2) Bob computes $v = y+s \mod 2^n$, $z_B = M_B[u,v]$ and $t_B = T[u,v]$ and then sends $(v, z_B, t_B)$ to Alice
3) If $Ver(K[u,v], t_B, z_B) = reject$ or no valid message is received, Alice outputs $z = f(x, y_0)$, else $z = M_A[u,v] \oplus z_B$.
**** Proving security
- This is no longer a purely deterministic setting (the MAC can be forged with probability up to $\epsilon$), so we have to argue about the joint distribution of the views and the output.
- The proof is a reduction to breaking the underlying MAC scheme, if we can break the OTTT protocol.
- *The simulator*:
1) Sample random $s, M_B$, generate keys $K$ for the MAC scheme and compute MACs $T = Tag(K, M_B)$ (for all entries) and send $(s, M_B, T)$ to Bob (so the simulator replaces the trusted dealer).
2) Sample a random $u$ and send it to Bob (replacing the honest Alice)
3) If Bob does not send anything, sends an invalid message, or sends a triple $(v', z'_B, t'_B)$ s.t. $z'_B \neq M_B[u,v']$, the simulator inputs $y_0$ to the ideal func. Else the sim inputs $y = v' - s$ to the ideal func.
- *Proof*:
1) /The view of Bob/ is distributed as in the real protocol: $s$ and $M_B$ are uniformly random in both experiments, and since $u = x+r$ for a uniformly random $r$ in the actual protocol, $u$ is uniform there just as in the simulation.
2) The output of Alice is distributed identically, except when the corrupt Bob sends a triple s.t. $Ver(K[u,v'], t'_B, z'_B) = accept$ while $z'_B \neq M_B[u,v']$: in that case the real Alice would output $f(x,y) \oplus 1$ as previously discussed, but the ideal Alice would output $f(x,y_0)$. Such a Bob can however be turned into an adversary against the underlying MAC scheme, which is assumed to be secure, thus completing the proof.
**** Possible MAC scheme
- As Bob only sees a single MAC for each key, we can use a simple information theoretic MAC scheme which is simply a linear function:
1) $k \leftarrow Gen()$: Sample $k = (\alpha, \beta) \leftarrow Z^2_p$ for a prime $p$
2) $t \leftarrow Tag(k, m)$: Compute a tag $t$ on a message $m$ with key $k = (\alpha, \beta)$; $t = \alpha \cdot m + \beta$
3) $\{accept, reject\} \leftarrow Ver(k,t,m)$: Output accept if $t = \alpha \cdot m + \beta$, reject otherwise.
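A Python sketch of this one-time MAC (the prime is an arbitrary toy choice):

#+begin_src python
# Toy sketch of the (1, eps)-secure one-time MAC t = alpha*m + beta over Z_p.
import random

p = 2**61 - 1                          # toy prime

def gen():
    return random.randrange(p), random.randrange(p)   # k = (alpha, beta)

def tag(k, m):
    alpha, beta = k
    return (alpha * m + beta) % p

def ver(k, t, m):
    return t == tag(k, m)

k = gen()
m = 1                                   # in OTTT the message is the bit M_B[u,v]
t = tag(k, m)
assert ver(k, t, m)
assert not ver(k, t, m ^ 1)             # flipping the bit invalidates the tag (w.h.p.)
#+end_src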
*** BeDOZa
- Works with an $m$-homomorphic MAC scheme, so likely just bring this up for the last topic, which only has one paper otherwise!
* Oblivious Transfer
** Curriculum
- Definition and applications.
- Protocols with passive security (from public-key encryption with random looking keys).
- The GMW compiler (3 steps: zero-knowledge proofs, coin-flip into the well, input commitment) and how to use it to construct active secure oblivious transfers - OR -
- Explicit constructions of active secure oblivious transfer in the common reference string (PVW protocol).
** Introduction
- The main variant of Oblivious Transfer (OT) is /1-out-of-2/ OT, or (2 1)-OT for short. It is described by the following ideal functionality:
1) Alice inputs a choice bit $b \in \{0,1\}$
2) Bob inputs two messages $m_0, m_1$
3) Alice learns $z = m_b$
- A secure OT should not let Alice learn anything about the unchosen message, and Bob should not learn which message Alice chose.
- /1-out-of-n/ OT is exactly what it sounds like.
+ /1-out-of-n/ OT directly implies secure two-party computation of any function $f(x,y)$ with $x \in [n]$: Bob creates his messages $M_0, ..., M_{n-1}$ as $M_i = f(i,y)$ using his own input $y$, and Alice uses $x$ as her choice, so she receives $M_x = f(x,y)$.
- OT can be used to remove the trusted dealer from BeDOZa.
+ Specifically, for the multiplication (AND) gate, the trusted dealer had to sample bits $u_A, v_A, u_B, v_B, w_B$, compute $w_A = (u_A \oplus u_B) \cdot (v_A \oplus v_B) \oplus w_B$ and then send the $A$-subscripted values to Alice and the $B$-subscripted ones to Bob.
+ This dealer can be replaced by a (4 1)-OT protocol (sketched below, after this list):
1) Alice samples random bits $u_A, v_A$ and inputs $i = 2 \cdot u_A + v_A$ to the OT
2) Bob samples $u_B, v_B, w_B$ and inputs to the OT: $$M_0 = (0 \oplus u_B) \cdot (0 \oplus v_B) \oplus w_B$$ $$M_1 = (0 \oplus u_B) \cdot (1 \oplus v_B) \oplus w_B$$ $$M_2 = (1 \oplus u_B) \cdot (0 \oplus v_B) \oplus w_B$$ $$M_3 = (1 \oplus u_B) \cdot (1 \oplus v_B) \oplus w_B$$
3) Alice then sets $w_A = M_i$
- There exists something called *Random OT*, which is a randomized functionality where the parties have no input. The functionality samples random bits $b,s_0,s_1$ and outputs $(b, z=s_b)$ to Alice and $(s_0,s_1)$ to Bob.
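A Python sketch of the dealer replacement described above (the (4 1)-OT itself is idealised as a plain function call; names are illustrative):

#+begin_src python
# Toy sketch: generating a BeDOZa multiplication triple from one (4 1)-OT.
import random

rbit = lambda: random.randint(0, 1)

def ot_4_1(messages, i):               # ideal 1-out-of-4 OT
    return messages[i]

# Alice samples her shares and her choice index
u_A, v_A = rbit(), rbit()
i = 2 * u_A + v_A

# Bob samples his shares and prepares the four OT messages
u_B, v_B, w_B = rbit(), rbit(), rbit()
M = [((a ^ u_B) & (b ^ v_B)) ^ w_B for a in (0, 1) for b in (0, 1)]
# M[2*a + b] corresponds to Alice's shares (u_A, v_A) = (a, b)

w_A = ot_4_1(M, i)                     # Alice's output from the OT

assert (w_A ^ w_B) == ((u_A ^ u_B) & (v_A ^ v_B))
#+end_src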
** OT from Public Key Crypto
*** With passive security and pseudorandom public keys
- Assume the public-key encryption scheme (PKE) has pseudorandom public keys. Then the following OT protocol has passive security:
+ *Choose*: Alice (who has a choice bit $b$) generates a public key $pk_b$ with a secret key $sk_b$ and samples another random string $pk_{1-b}$ whose secret key she does not know. She sends $(pk_0, pk_1)$ to Bob.
+ *Transfer*: Bob (with messages $m_0,m_1$) creates two ciphertexts $c_0,c_1$ where $c_i$ is an encryption of $m_i$ under $pk_i$. He sends $(c_0,c_1)$ to Alice.
+ *Retrieve*: Alice can only decrypt $c_b$, as she only knows $sk_b$. She learns $m_b$.
- Since keys are pseudorandom, Bob can not distinguish between the fake PK and the real one. Alice does not know the sk for the fake pk and thus she can not decrypt the undesired message.
*** Passive security with oblivious key generation
- It is not required that public keys are pseudorandom, but merely that there should be an alternative way of generating the public-keys such that:
1) Public keys generated like this look like regular pks
2) It is not possible to learn the sk corresponding to a pk generated like this.
+ Note that we can't simply let Alice erase the sk after having computed the pk. It might not be easy to securely erase data and there is no way to verify that Alice has properly erased it. A passive party has to follow the protocol correctly, but is still allowed to look at their view and learn from this. If they have a secret key, they are allowed to use this!
- A PKE with /oblivious key generation/ is a regular *IND-CPA-secure* PKE defined by the three algos *Gen, Enc, Dec*, but with an additional /oblivious generation/ algo *OGen*. OGen outputs strings which look like regular pks. OGen must satisfy:
1) Let $b$ be a random bit, $pk_0 \leftarrow Gen(sk)$ the regular pk generation and $pk_1 \leftarrow OGen(r)$ the oblivious pk generation, where $sk$ and $r$ are chosen uniformly at random. Then there is no PPT algo D s.t. $D(pk_b) = b$ with probability non-negligibly larger than 1/2. (If a distinguisher were correct with probability noticeably /below/ 1/2 one could just flip its answer, so the relevant quantity is the advantage $|\Pr[D(pk_b)=b] - 1/2|$.)
2) It is possible to *efficiently invert* $pk \leftarrow OGen(r)$, which is denoted $r \leftarrow OGen^{-1}(pk)$
- These two props imply it's infeasible to find sk corresponding to obliviously generated pks, even if you know the randomness used to generate it;
3) There exists no PPT algo A, which can output $sk \leftarrow A(r)$ s.t. $Gen(sk) = OGen(r)$.
- $OGen^{-1}$ must be able to "explain" real pks, as if they were generated by OGen, since a distinguisher can check if $$OGen(OGen^{-1}(pk)) = pk$$ This will apply to keys generated with OGen and thus it must also apply to keys generated by regular Gen, otherwise these two would not be indistinguishable. Therefore it must also hold that: $$OGen(OGen^{-1}(Gen(sk))) = Gen(sk)$$
- However, as (Gen, Enc, Dec) is a secure scheme, it MUST be hard to compute sk from a pk generated with Gen, so $pk \leftarrow Gen(sk)$ has to be a one-way function. This gives a contradiction: if there is an A who can break property 3, then we can invert $pk \leftarrow Gen(sk)$ by computing $sk \leftarrow A(OGen^{-1}(pk))$.
- For the OT protocol to be secure we need more: the scheme must remain IND-CPA secure even when the encryption is performed under a pk output by OGen, and even if the adversary knows the randomness used by OGen to generate that specific key. Formally:
4) For all $m$: let $b$ be a random bit, $m_0 = m$, $m_1$ a uniformly random message and $pk \leftarrow OGen(r)$. Then there is no PPT algo D s.t. $D(r, Enc(pk,m_b)) = b$ with probability significantly larger than 1/2.
+ This can be proven using properties 1 and 2 together with the PKE being IND-CPA.
*** Constructing Oblivious Key Gen
- Not all functions admit an oblivious sampler; see http://cs.au.dk/~orlandi/asiacrypt-draft.pdf
- Think of a pseudorandom generator PRG, which expands n-bit long seed s into a 2n bit long string y=PRG(s). A trivial sampler for PRG is the identity function which maps 2n bit strings into 2n bit strings. This function is invertible and the security of the PRG implies that the output of the PRG is indistinguishable from uniformly random 2n-bit strings.
**** El Gamal sampler
- Given the description of a group $(G,g,q)$ where the DDH assumption is believed to hold, with $g$ a generator of $G$ and $q$ the order of $G$, ElGamal is described by:
+ *Gen(sk):* Input secret key $sk = \alpha \in Z_q$, compute $h = g^{\alpha}$ and output $pk = (g,h)$
+ *$Enc_{pk}(m)$:* Input message $m \in G$, sample random $r \in Z_q$, parse $pk = (g,h)$, output $C = (g^r, m \cdot h^r)$
+ *$Dec_{sk}(C)$:* Parse $C$ as $(c_1, c_2)$. Output $m = c_2 \cdot c_1^{-\alpha}$.
- We thus need to construct an $OGen(r)$ which outputs a $pk = (g,h)$, is invertible, and is indistinguishable from $Gen$. Pick a group where DDH is assumed to be hard: the multiplicative subgroup of order $q$ of $Z^*_p$, where $p = 2q+1$. To generate a random element, use the random string $r$ to sample $s$ between $1$ and $p-1$ and output $h = s^2 \mod p$. This process is invertible (it is easy to compute square roots modulo a prime), and $h$ is distributed uniformly among the elements of order $q$, as required.
*** The protocol
- *Choose:* Alice with choice bit b:
1) Samples random $sk, r$
2) Gens $pk_b <- Gen(sk)$, $pk_{1-b} <- OGen(r)$
3) Sends $(pk_0, pk_1)$ to Bob
- *Transfer Phase:* Bob with $m_0,m_1$:
1) Computes $c_0 = Enc_{pk_0}(m_0,r_0)$ and $c_1 = Enc_{pk_1}(m_1,r_1)$, using random $r_0,r_1$
2) Sends $(c_0, c_1)$ to Alice
- *Retrieve Phase:* Alice outputs $m_b <- Dec(sk, c_b)$
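A toy Python sketch of this protocol, instantiated with ElGamal over the order-$q$ subgroup of $Z_{23}^*$ and with OGen implemented by squaring a random element (all parameters are tiny and purely illustrative; messages are encoded as group elements):

#+begin_src python
# Toy sketch (not secure: tiny parameters) of passive OT from a PKE with
# oblivious key generation, instantiated with ElGamal over the quadratic
# residues of Z_p^* for the safe prime p = 2q + 1 = 23.
import random

p, q, g = 23, 11, 4                   # g = 2^2 generates the order-q subgroup

def gen():
    sk = random.randrange(1, q)
    return pow(g, sk, p), sk          # pk = g^sk

def ogen():
    s = random.randrange(1, p)
    return pow(s, 2, p)               # random quadratic residue, sk unknown

def enc(pk, m):                       # m must be a quadratic residue here
    r = random.randrange(1, q)
    return pow(g, r, p), (m * pow(pk, r, p)) % p

def dec(sk, c):
    c1, c2 = c
    return (c2 * pow(c1, q - sk, p)) % p   # c1^{-sk} = c1^{q-sk} in the order-q group

# Choose: Alice with choice bit b
b = 1
pk_real, sk = gen()
pk_fake = ogen()
pks = (pk_fake, pk_real) if b else (pk_real, pk_fake)

# Transfer: Bob with messages m0, m1 (encoded as quadratic residues)
m0, m1 = pow(3, 2, p), pow(5, 2, p)
c0, c1 = enc(pks[0], m0), enc(pks[1], m1)

# Retrieve: Alice decrypts only c_b
assert dec(sk, (c0, c1)[b]) == (m0, m1)[b]
#+end_src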
** The GMW Compiler
- Uses two tools from CPT, commitments and ZK-PoK
*** Note on commitments
- Hiding, Binding
- Simple protocol based on DL:
1) R chooses a random $h$ which generates $G$
2) S wants to commit to a value $x \in Z_q$. S chooses random $r \in Z_q$, computes $c = g^x h^r$ and sends $c$ to $R$
3) To open, S sends $(x',r')$ to $R$. If $c = g^{x'}h^{r'}$, $R$ outputs $x'$, and $\perp$ otherwise.
- Is unconditionally hiding, as $h^r$ is uniformly random in $G$.
- Is computationally binding, as given two openings $(x,r) \neq (x',r')$ of the same $c$ one can compute the discrete log of $h$ in base $g$.
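A toy Python sketch of this commitment scheme (tiny toy group, purely illustrative):

#+begin_src python
# Toy sketch of the DL-based (Pedersen-style) commitment above, in the
# order-q subgroup of Z_23^*.
import random

p, q, g = 23, 11, 4

# R chooses a random h generating the subgroup (its DL w.r.t. g stays unknown to S)
h = pow(g, random.randrange(1, q), p)

def commit(x, r):
    return (pow(g, x, p) * pow(h, r, p)) % p

def open_check(c, x, r):
    return c == commit(x, r)

x, r = 7, random.randrange(1, q)
c = commit(x, r)                 # S sends c to R
assert open_check(c, x, r)       # later, S sends (x, r); R verifies
#+end_src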
*** Note on ZK
- Use ideal ZK functionalities, i.e. a model where the parties have access to ideal boxes which, on common input $x$ and private input $w$ from one of the parties, output $R(x,w)$. Such a box can in practice be replaced with any of the ZK protocols for NP-complete languages from CPT.
+ Such as the classic ZK proof for graph 3-colourability
*** The compiler
- We can build passive OT using any PKE scheme with OGen
+ Is only passive though, as Alice could just sample both PKs using Gen. Bob would not know this, as OGen and Gen should be indistinguishable from each other.
- Can we simply add a ZK proof that some $r$ was used to generate one of the two keys?
+ $x = (pk_0,pk_1)$, $w=r$, and the relation accepts if $(pk_0 = OGen(r) \text{ or } pk_1 = OGen(r))$. This can be attacked by computing $pk_0 = Gen(sk_0)$, $pk_1 = Gen(sk_1)$, $r=OGen^{-1}(pk_0)$ s.t. $pk_0 = OGen(r)$, and then using $r$ for the ZK proof.
- The issue arises because Alice can choose her own randomness; on the other hand, we cannot let Bob choose it for her either.
- *Coin flip into the well* is a fix for this. Essentially, it lets Bob contribute to the randomness without learning the end result: Bob chooses some $r_B$, Alice chooses some $r_A$, and once Bob sends $r_B$ to Alice they are XORed into $r = r_A \oplus r_B$, which only Alice learns.
1) Alice choose random string $r_A$ and randomness $s$ for the commitment scheme. She computes $c = Com(r_A, s)$ and sends $c$ to Bob.
2) Bob choose random $r_B$ and sends this to Alice.
3) Alice computes $r = r_A \oplus r_B$; in the end Alice holds $(r, s)$ and Bob holds $(c, r_B)$.
- Finally, $r$ is random: $c$ is hiding, so Bob cannot base $r_B$ on $r_A$, and $c$ is binding, so Alice cannot switch to a new $r_A$.
- Now Alice can compute $pk_b = Gen(sk)$, $pk_{1-b} = OGen(r)$ using the $r$ from the coinflipping protocol. Alice will then send $(pk_0, pk_1)$ to Bob and Alice and Bob will then use ZK box for the following relation: $x = (pk_0, pk_1, c, r_B)$, witness $w = (r,s)$ and the relation outputs $1$ if: $c = Com(r \oplus r_B, s) \text{ and } (pk_0 = OGen(r) \text{ or } pk_1 = OGen(r))$.
- This addition suffices to make the aforementioned protocol actively secure. It is not always enough though, as such, a third step might be required.
*** The fucking-up protocol
- Proto
1) Alice gens $pk^1_b = Gen(sk^1)$ and $pk^1_{1-b} = OGen(r^1)$ and sends $(pk^1_0, pk^1_1)$ to Bob
2) Bob computes $e^1_0 = Enc(pk^1_0, m_0), e^1_1 = Enc(pk^1_1, m_1)$. Sends $(e^1_0, e^1_1)$ to Alice
3) Alice gens $pk^2_b = Gen(sk^2)$ and $pk^2_{1-b} = OGen(r^2)$ and sends $(pk^2_0, pk^2_1)$ to Bob
4) Bob computes $e^2_0 = Enc(pk^2_0, m_0), e^2_1 = Enc(pk^2_1, m_1)$. Sends $(e^2_0, e^2_1)$ to Alice
5) Alice outputs $m_b = Dec(sk^1, e^1_b)$
- So the protocol simply runs two copies of the original protocol /using the same inputs/. This protocol is still secure against passive corruption. If we simply compile this the same way as before (so only two steps), we get something which is not actively secure:
1) Run the coin-flip proto; Alice gets $(r,s)$ and parses $r = (r^1, r^2)$, Bob gets $(c, r_B)$
2) Alice gens $pk^1_{1-b} = OGen(r^1)$, sends ($pk^1_0, pk^1_1)$ to Bob.
3) Alice and Bob run a ZK proof where Alice proves that $(pk^1_0 = OGen(r^1) \text{ or } pk^1_1 = OGen(r^1))$
4) Bob computes and sends $(e^1_0,e^1_1)$
5) Alice gens $pk^2_{1-b} = OGen(r^2)$, sends ($pk^2_0, pk^2_1)$ to Bob.
6) Alice and Bob run a ZK proof where Alice proves that $(pk^2_0 = OGen(r^2) \text{ or } pk^2_1 = OGen(r^2))$
7) Bob computes and sends $(e^2_0,e^2_1)$
- This is clearly not secure, as Alice can change which bit she is interested in between steps 2 and 5. As a result Bob encrypts both $m_0$ and $m_1$ under real public keys (whose secret keys Alice knows), so she learns both.
- Is fixed by having Alice also commit to her input bit.
1) Alice with input $b$ chooses random $r_A,s,t$, computes $c = Com(r_A,s)$ and $d = Com(b,t)$. She sends $(c,d)$ to Bob.
2) Alice and Bob perform a ZK proof where $x = (c,d)$, $w = (b,r_A,s,t)$ and $R(x,w)=1$ if $c = Com(r_A,s) \text{ and } d = Com(b,t)$.
3) Bob chooses $r_B$ and sends it to Alice
4) Alice computes $r = r_A \oplus r_B$, chooses a random $sk$ and computes $pk_b = Gen(sk)$ and $pk_{1-b} = OGen(r)$. She sends $(pk_0,pk_1)$ to Bob
5) Alice and Bob perform another ZK proof; $x = (c,d,r_B,pk_0,pk_1)$, $w = (b,r_A,s,t)$, $R(x,w) = 1$ if: $c = Com(r_A, s) \text{ and } d = Com(b,t) \text{ and } pk_{1-b} = OGen(r_A \oplus r_B)$. (Note that the last conjunct refers to $pk_{1-b}$, which is well-defined since the witness contains $b$; in the hybrid proof the simulator learns the whole witness and can therefore check it.)
6) Bob sends $(e_0,e_1)$ to Alice, $e_i = Enc(pk_i,m_i)$.
7) Alice outputs $m_b = Dec(sk, e_b)$.
- The protocol is proven secure against an actively corrupted Alice by constructing a simulator. Remember that in the hybrid model the simulator simulates all calls to the ZK box. In other words, every time Alice inputs something to the ZK box the simulator learns w. The simulator uses the corrupted A as a subroutine and works as follows:
1) S receives $(c,d)$ from A
2) S receives $w = (b,r_A,s,t)$ from A
3) S sends random $r_B$ to A
4) S receives $(pk_0, pk_1)$ from A
5) S receives $w' = (b', r'_A, s', t')$ from A
6) If $w' \neq w$ or $pk_{1-b} \neq OGen(r_A \oplus r_B)$ (which the sim can check since it has the witness), then abort. Otherwise the sim inputs $b$ to the ideal func and learns $m_b$.
7) The sim computes $e_b = Enc(pk_b, m_b)$ and $e_{1-b} = Enc(pk_{1-b}, 0)$ (we don't know the other message, but by IND-CPA security an encryption of $0$ is indistinguishable) and outputs the view for Alice $(b,t,r,s,r_B,e_0,e_1)$.
- Informally, there are only two differences between the simulated view and Alice's real view:
1) The simulation aborts if $w' \neq w$, while the real proto continues as long as $Com(b',t') = Com(b,t)$ and $Com(r'_A,s') = Com(r_A,s)$. However, if this happens in the protocol, then A can be used to break the binding property of $Com$, a contradiction.
2) In the sim, $e_{1-b} = Enc(pk_{1-b},0)$, while in the proto $e_{1-b} = Enc(pk_{1-b}, m_{1-b})$. If a distinguisher succeeds here, we can break the IND-CPA (chosen plaintext attack) security of the underlying scheme:
1) Receive pk from the IND-CPA challenger
2) Query the IND-CPA challenger with messages (case 0) $m_{1-b}$ and (case 1) $0$ and receive a challenge ciphertext $e^*$
3) Construct a simulated view where $r_B = r_A \oplus OGen^{-1}(pk)$ and $e_{1-b} = e^*$ and give this to the distinguisher. If the distinguisher thinks the view is from a real proto, guess case 0, o.w. guess case 1.
** Efficient Active Secure OT
- GMW compiler has massive overhead and leads to very inefficient protocols
- Thus, we have to come up with another way of making sure that Alice does not know the secret key of the fake public key.
*** El Gamal based encryption scheme
- (G,g,q) is a group where DDH is hard
- *Key gen*: PKs are four group elements $(g,h,u,v)$ and the sk is $x \in Z_q$ s.t. $u = g^x$, $v = h^x$.
- *Enc*: To encrypt $m \in G$, choose random $r,s \in_R Z_q$ and out $(c,d) = (g^r h^s, m \cdot u^r v^s)$
- *Dec*: $m = d \cdot c^{-x}$, so like El Gamal
- The point of this scheme: whether the pk $(g,h,u,v)$ is a DDH tuple cannot be decided under the DDH assumption, and when it is /not/ a DDH tuple the ciphertext statistically hides $m$ (this is what the discussion below relies on).
*** The active secure OT proto
- We assume the existence of a trusted dealer who can output a common random string to both parties. This dealer can be replaced with a secure coin-flipping protocol
- *Setup*: Use group $(G,g,q)$ with order $q$, gen by $g$. Trusted Dealer outputs four random group elements: $crs = (g_0,h_0,g_1,h_1)$ to Alice and Bob
- *Choose*: Alice with input $b \in \{0,1\}$, samples $x \in_R Z_q$, computes $(u,v) = (g^x_b, h^x_b)$, sends $(u,v)$ to Bob.
- *Transfer*: Bob with $m_0, m_1 \in G$ defines $pk_0 = (g_0, h_0, u,v)$ and $pk_1 = (g_1,h_1,u,v)$, samples $r_0,s_0,r_1,s_1 \in_R Z_q$ and computes: $e_0 = Enc(pk_0,m_0;r_0,s_0)$ and $e_1 = Enc(pk_1,m_1;r_1,s_1)$ and sends these to Alice
- *Retrieve*: Alice computes $m_b = Dec(x,e_b)$.
- Neither Alice nor Bob can determine if $(g_0,h_0,g_1,h_1)$ is a DDH tuple or not
- if $(g_0,h_0,g_1,h_1)$ is not a DDH tuple, then at most one of $pk_0,pk_1$ is a DDH tuple, which implies that the encryption using the wrong key is information-theoretically secure.
- If $(g_0,h_0,g_1,h_1)$ is a DDH tuple, then $(u,v)$ perfectly hides choice bit $b$; if $(g_0,h_0,g_1,h_1)$ is a DDH tuple, then $h_0 = g^\alpha_0$ and $h_1 = g^\alpha_1$, with the same $\alpha$, meaning that given $(u,v)$ there exists $x_0,x_1$ s.t. $(u,v) = (g^{x_0}_0,h^{x_0}_0) = (g^{x_1}_1,h^{x_1}_1)$
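A toy Python sketch of this CRS-based OT (the group $Z_{23}^*$, the generator and the message encoding are arbitrary toy choices):

#+begin_src python
# Toy sketch of the CRS-based OT above, over the order-q subgroup of Z_23^*.
import random

p, q, g = 23, 11, 4                      # g generates the subgroup of order q

def rand_exp():
    return random.randrange(1, q)

# Setup: the trusted dealer outputs four random group elements as the crs
g0, h0, g1, h1 = (pow(g, rand_exp(), p) for _ in range(4))

# Choose: Alice with choice bit b
b, x = 0, rand_exp()
gb, hb = ((g0, h0), (g1, h1))[b]
u, v = pow(gb, x, p), pow(hb, x, p)      # (u, v) is sent to Bob

# Transfer: Bob encrypts m_i under pk_i = (g_i, h_i, u, v)
def enc(gi, hi, m):
    r, s = rand_exp(), rand_exp()
    c = (pow(gi, r, p) * pow(hi, s, p)) % p
    d = (m * pow(u, r, p) * pow(v, s, p)) % p
    return c, d

m0, m1 = 9, 2                            # messages encoded as group elements
e0, e1 = enc(g0, h0, m0), enc(g1, h1, m1)

# Retrieve: Alice decrypts e_b using x
c, d = (e0, e1)[b]
assert (d * pow(c, q - x, p)) % p == (m0, m1)[b]   # d * c^{-x}
#+end_src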
*** Formal proof
- When Alice is corrupted:
1) The sim choose $crs$ as a non-DDH tuple, it gens $(g_0,h_0,g_1,h_1)$ s.t. $h_0 = g^{\alpha_0}_0$, $h_1 = g^{\alpha_1}_1$, s.t. $\alpha_0 \neq \alpha_1$.
2) The sim extracts input bit of Alice from $(u,v)$ by finding $b$ s.t. $v = u^{\alpha_b}$ and if no such $b$ exists s.t. the relation holds, set $b=0$.
3) The sim inputs $b$ to the ideal func and learns $m_b$, defines $m_{1-b} = 0$. Sim sends $(Enc(pk_0,m_0), Enc(pk_1,m_1))$ to corrupt Alice
- This sim is statistically indistinguishable from the real protocol: a uniformly random $crs$ is a DDH tuple only with negligible probability, and whenever it is not, encryptions under $pk_{1-b}$ are perfectly hiding.
- When Bob is corrupted:
1) The sim chooses $crs$ as a DDH tuple: it gens $(g_0,h_0,g_1,h_1)$ s.t. $h_0 = g^{\alpha}_0$, $h_1 = g^{\alpha}_1$ and $g_1 = g^\beta_0$
2) The sim samples a random $x_1$ and sets $(u,v) = (g^{x_1}_1, h^{x_1}_1)$. The sim also defines $x_0 = x_1 \cdot \beta$; now $(u,v) = (g^{x_0}_0, h^{x_0}_0)$ by the definition of $x_0$.
3) The sim receives $(e_0,e_1)$ from corrupt Bob, compute $m_0 = Dec(x_0,e_0)$ and $m_1 = Dec(x_1, e_1)$ and inputs them to ideal func.
- The view of Bob in the sim is computationally indistinguishable from the protocol; the only difference is the distribution of the crs.
- Showing that the output of the honest Alice in the sim is identical to the output of the honest Alice in the proto is left out of the proof.
* Garbled Circuits
** Curriculum
- Definition and high-level view.
- How to garble individual gates.
- Application to constant-round secure two-party computation with passive security.
- Attacks against the passive secure protocol (garbling wrong functions, selective failure attacks) and (brief) sketch of possible countermeasures (simple cut-and-choose strategy).
** Intro
- The topic of this week is “garbled circuits” and how they can be used (together with oblivious transfer) to achieve passively secure two-party computation in a constant number of rounds, aka “Yao's protocol” (as opposed to the BeDOZa protocol, where the number of rounds was proportional to the circuit depth).
- A garbling scheme is a crypto primitive which allows one to evaluate encrypted functions on encrypted inputs.
- Garbled circuits allow secure computation in constant rounds, and Yao (combined with a 2-message OT proto) uses only two rounds.
+ The 2 rounds are crucial for some applications, as they allow a user A to publish some kind of encryption of her input; anyone else can then send her a single message that allows her to recover the result of the computation
** Abstract Desc of Garbling Circuits
# TODO: Is this a spelling mistake, Ev ev? - NO!
- A garbling scheme is defined by a tuple $G = (Gb, En, De, Ev, ev)$
+ $Gb$ is the garbled circuit generation function. It is randomized and, on input $1^k$ plus the description of a boolean circuit $f : \{0,1\}^n \rightarrow \{0,1\}$ (both $n$ and $|f|$ must be polynomially bounded in $k$), outputs a triple of strings $(F,e,d)$: the garbled circuit, the encoding information and the decoding information.
+ $ev$ allows the evaluation of the plaintext circuit $f$, so $ev(f,x) = f(x)$, i.e. it takes $f$ (the ungarbled function) and some input $x$ and runs $f$ on $x$.
+ $En$ is the encoding function. It is deterministic and uses $e$ (the encoding information) to map an input $x$ to a garbled input $X$. A simple garbling scheme for 2PC is a projective scheme where $e = (\{K_i^0, K_i^1\}_{i \in [1,n]})$ and the garbled input $X$ is then simply $\{K_i^{x_i}\}_{i \in [1,n]}$, so the encoding information is a set of $n$ pairs of keys, one pair per input bit, and the garbled input $X$ is simply a subset of those strings.
+ $Ev$ is the garbled evaluation function. It is deterministic, evaluates a garbled circuit $F$ on garbled input $X$ and produces a garbled output $Z'$.
+ $De$ is the decoding function; it uses the decoding information $d$ to decode the garbled output $Z'$ into a plaintext output $z$. In projective schemes, $d$ consists of two strings, $d = (Z_0, Z_1)$, and $De$ outputs $z = 0$ if $Z' = Z_0$, $z = 1$ if $Z' = Z_1$ and $\perp$ if $Z' \not\in \{Z_0, Z_1\}$, i.e. if some error occurred.
+ Correctness is defined like so: Let $G$ be a garbling scheme. $G$ enjoys correctness if for all $f : \{0,1\}^n \rightarrow \{0,1\}$ and all $x \in \{0,1\}^n$ the following probability is negligible:
$$Pr(De(d,Ev(F,En(e,x))) \neq f(x) : (F,e,d) \leftarrow Gb(1^k, f))$$
I.e. the probability that the decoding function run on the evaluation function, which is run on the garbled circuit and the garbled input, is not the result of the plaintext function simply run on the ungarbled input x, should be very low.
*** Formal description of passive 2PC based on garbled circuits
- 2PC = 2-party computation
- So projective garbling schemes as described above can be used to accomplish 2PC like so:
1) Setup: Alice and Bob have inputs $x,y \in \{0,1\}^n$ and wish to compute some function $f(x,y)$ where $f : \{0,1\}^{2n} \rightarrow \{0,1\}$ describes a circuit in which the first $n$ inputs come from Alice and the last $n$ come from Bob. We split $e = (e_x, e_y)$ to divide the input encoding information into a part belonging to Alice and one belonging to Bob.
2) Garble: Bob will run $(F, e_x, e_y, d) \leftarrow Gb(1^k, f)$ and send $F$ to Alice. Note that $e_x$ and $e_y$ do not by themselves describe the input, they are just used to encode the input bits later.
3) Encode Bob's Input: Bob runs $Y \leftarrow En(e_y, y)$ and sends $Y$ to Alice. So Bob sends his garbled input to Alice, which will at this point simply be a list of keys.
4) Encode Alice's Input: Alice and Bob run a secure 2PC where Bob inputs $e_x$, Alice inputs $x$ and Alice learns $X \leftarrow En(e_x, x)$. For the projective scheme where $e_x = \{K_0^i, K_1^i\}_{i \in [1,n]}$, Bob knows these $n$ pairs of keys, Alice has her bit vector $x \in \{0,1\}^n$, and Alice should end up with $X = \{K_{x_i}^i\}_{i \in [1,n]}$ where $x_i$ is the $i$'th input bit. This can easily be accomplished by running $n$ OT protocols, where Alice inputs the bit she is interested in and Bob inputs the two keys corresponding to the $i$'th bit.
5) Evaluation: Alice can now compute $Z \leftarrow Ev(F,X,Y)$
+ Bob output: Alice sends $Z$ to Bob who can output $z \leftarrow De(d,Z)$
+ Alice output: Bob sends $d$ to Alice who can output $z \leftarrow De(d,Z)$
- Which sub bullet of step 5 is run depends on who is to get the output of the function.
- The messages of steps 2, 3 and 5b (all sent by Bob) can be sent together. Also, if one uses a 2-message OT protocol in step 4, the whole protocol can be compressed to two messages: in the first message Alice sends Bob the first message of the OT protocol, and in the second message Bob sends Alice the second OT message together with the messages of steps 2, 3 and 5b.
- Security: The protocol is only secure against passive corruption, i.e. we only have to simulate the views of parties that follow the protocol. Consider the variant where Alice gets the output (5b):
+ Simulating Bob is easy: apart from the OT messages, Alice sends nothing to Bob, so Bob's view consists of his input, the randomness he used for garbling and the OT messages; it can be produced by running $Gb$ honestly and invoking the simulator of the OT protocol.
+ Simulating Alice: given Alice's input $x$ and output $z = f(x,y)$, but not $y$, the sim garbles a circuit $f'$ with the same structure that always outputs $z$, encodes $x$ and a dummy $y$, and includes $F', X', Y', d'$ in the view (also simulating the OTs). Indistinguishability between the real and the ideal view follows from the following property of garbling schemes:
+ (Garbled Circuit Privacy) Let $G$ be a garbling scheme. $G$ enjoys privacy if there exists a sim $S$ s.t. for all $f : \{0,1\}^n \rightarrow \{0,1\}$ and all $x \in \{0,1\}^n$, the output of the sim on input the structure of the circuit $struct(f)$ (defined below) and the output, $(F', X', d') \leftarrow S(1^k, struct(f), f(x))$, is indistinguishable from $(F,X,d)$ where $(F, e,d) \leftarrow Gb(1^k, f)$, $X \leftarrow En(e,x)$.
- Essentially, a garbled circuit and a garbled input only leak the output of the circuit on that input plus some partial information about the circuit computing $f$, namely $struct(f)$. This structure includes the number of gates, the number of wires, the number of input and output bits and the wiring of the circuit, but everything else about the input and about which function is being computed stays hidden. One can even hide the structure by applying a standard transformation: garble the /universal circuit/, which takes as input the description of a circuit up to some size bound together with an input and evaluates that circuit on that input; then only an upper bound on the size of the circuit is leaked. This is used for private function evaluation, where Alice knows $x$, Bob knows $f$, and Alice should learn $f(x)$ without learning $f$.
** TODO: The actual protocol
*** Circuit Notation
- Use the notation $f : \{0,1\}^n \rightarrow \{0,1\}$ for a circuit. A circuit has $T$ wires. The wires $w_i$ with index $i \in [1,n]$ are called input wires, the wire with index $i = T$ is the output wire, and all other wires $w_i$ with $i \in [n+1, T-1]$ are internal wires. We can assume the circuit is composed of NAND gates (one per non-input wire), as any function can be built from NAND gates. Thus, for all $i \in [n+1, T]$, $w_i = \lnot(w_{L(i)} \wedge w_{R(i)})$, where $L : [n+1, T] \rightarrow [1,T]$ and $R : [n+1, T] \rightarrow [1,T]$ are functions specifying the left and right input wire of each gate. It must hold that $L(i) \leq R(i) < i$, since the value on each wire is a function of the values on wires with lower indices. The input wire values are given as input and are not computed from other values, so $L$ and $R$ are not defined for them.
*** Underlying Crypto Primitives
- We only require a pseudorandom function $G : \{0,1\}^k \times \{0,1\}^k \times [1,T] \rightarrow \{0,1\}^{2k}$ (efficient to evaluate and hard to distinguish from a random function).
- We use this PRF as $G(A,B,i)$ where $A$ and $B$ are two $k$-bit keys and $i$ is a public index. As a minimum requirement, we need that as long as at least one of the two keys is unknown to the adversary, the output of the PRF is unpredictable.
*** Circuit generation
- To generate a garbled circuit $(F,e,d)$ from a boolean circuit $f$ with $T$ wires, the algo $Gb$ is defined like so:
1) For each wire $i \in [1,T]$ in the circuit: Choose two random strings $(K^i_0, K^i_1) \leftarrow \{0,1\}^k \times \{0,1\}^k$. Since $T$ is the index of the output wire, set $d = (Z_0, Z_1) = (K^T_0, K^T_1)$. For the $n$ input wires, set $e = \{K^i_0, K^i_1\}_{i \in [1,n]}$.
2) For $i \in [n+1, T]$ (all non-input wires), define a garbled table $(C^i_0, C^i_1, C^i_2, C^i_3)$ like the following:
a) For all $(a,b) \in \{0,1\} \times \{0,1\}$ compute: $$C^i_{a,b} = G(K^{L(i)}_a, K^{R(i)}_b, i) \oplus (K^i_{\lnot(a \wedge b)}, 0^k)$$ (the $\lnot(a \wedge b)$ in the subscript means taking the NAND of the two values $a$ and $b$; if we use gates other than NAND, we plug in that gate's function instead)
b) Choose a random permutation $\pi : \{0,1,2,3\} \rightarrow \{0,1\} \times \{0,1\}$ and add $$(C^i_0, C^i_1, C^i_2, C^i_3) = (C^i_{\pi(0)}, C^i_{\pi(1)}, C^i_{\pi(2)}, C^i_{\pi(3)})$$ to $F$.
*** Encoding
- The function $En(e,x)$ parses $e = \{K^i_0, K^i_1\}_{i \in [1,n]}$ and outputs $X = \{K^i_{x_i}\}_{i \in [1,n]}$.
*** Evaluation
- The function $Ev(F,X)$ parses $X = \{K^i\}_{i \in [1,n]}$ and does, for all $i \in [n+1, T]$:
1) Recover $(C^i_0, C^i_1, C^i_2, C^i_3)$ from $F$ (essentially just fetch them: Alice looks up gate $i$'s table in $F$).
2) For $j = 0,1,2,3$ compute: $(K'_j, \tau_j) = G(K^{L(i)}, K^{R(i)}, i) \oplus C^i_j$
3) If there is a unique $j$ s.t. $\tau_j = 0^k$, define $K^i = K'_j$, else abort and output $\perp$. So essentially we try each entry of the table (as we do not know the permutation) until we find one which decrypts "correctly". Keep in mind that the table entries were already computed from the gate's truth table during circuit generation, so we simply pick out the evaluation matching our input keys.
- Output $Z' = K^T$.
*** Decoding
- The function $De(d,Z')$ parses $d = (Z_0, Z_1)$ and outputs $0$ if $Z' = Z_0$, $1$ if $Z' = Z_1$, else $\perp$.
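A compact Python sketch of the whole garbling scheme described above (SHA-256 stands in for the PRF $G$; the example circuit, key length and all names are illustrative only):

#+begin_src python
# Minimal sketch of Gb, En, Ev, De for circuits made of NAND gates.
# Wires 1..n are inputs, wire T is the output; L, R give the wiring.
import hashlib, os, random

K = 16  # key length in bytes ("k" in the notes)

def G(A, B, i):
    # stand-in for the PRF G : {0,1}^k x {0,1}^k x [1,T] -> {0,1}^{2k}
    return hashlib.sha256(A + B + i.to_bytes(4, "big")).digest()  # 32 = 2K bytes

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def Gb(n, T, L, R):
    keys = {i: (os.urandom(K), os.urandom(K)) for i in range(1, T + 1)}
    F = {}
    for i in range(n + 1, T + 1):
        rows = {}
        for a in (0, 1):
            for b in (0, 1):
                out_key = keys[i][1 - (a & b)]            # NAND of a, b
                rows[(a, b)] = xor(G(keys[L[i]][a], keys[R[i]][b], i),
                                   out_key + bytes(K))    # append the 0^k tag
        perm = list(rows.values())
        random.shuffle(perm)                              # random permutation pi
        F[i] = perm
    e = {i: keys[i] for i in range(1, n + 1)}
    d = keys[T]                                           # (Z_0, Z_1)
    return F, e, d

def En(e, x):
    return {i: e[i][x[i - 1]] for i in e}                 # pick K^i_{x_i}

def Ev(F, X, n, T, L, R):
    wire = dict(X)
    for i in range(n + 1, T + 1):
        for C in F[i]:
            cand = xor(G(wire[L[i]], wire[R[i]], i), C)
            if cand[K:] == bytes(K):                      # found the 0^k tag
                wire[i] = cand[:K]
                break
        else:
            return None                                   # abort with "bottom"
    return wire[T]

def De(d, Z):
    return 0 if Z == d[0] else 1 if Z == d[1] else None

# Example: f(x1, x2) = x1 AND x2, built as NAND(NAND(x1, x2), NAND(x1, x2))
n, T = 2, 4
L, R = {3: 1, 4: 3}, {3: 2, 4: 3}
F, e, d = Gb(n, T, L, R)
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    assert De(d, Ev(F, En(e, x), n, T, L, R)) == (x[0] & x[1])
#+end_src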
*** Correctness
Why do we look for the $0^k$ (appended in the generation step to the key we XOR with)? During garbling, the circuit generator encrypts the output keys of each gate under all four combinations of input keys; during evaluation, the evaluator only has one key per input wire and needs to find the output-wire key for the right value. The evaluator knows $K^{L(i)} = K^{L(i)}_a$ for some $a \in \{0,1\}$ but does not know $a$ (and likewise for $b$), yet still needs to compute the output key $K^i_c$ for $c = \lnot(a \wedge b)$. This is possible thanks to the $0^k$ added in the generation step: the evaluator only finds this redundancy when decrypting the right entry of the garbled table, in which case he finds $(K, 0^k)$, since $$G(K^{L(i)}_a, K^{R(i)}_b, i) \oplus C^i_{\pi^{-1}(a,b)} = (K^i_{\lnot(a \wedge b)}, 0^k)$$
If however $\pi(j) = (a', b') \neq (a,b)$, then:
$$G(K^{L(i)}_a, K^{R(i)}_b, i) \oplus C^i_j = G(K^{L(i)}_a, K^{R(i)}_b, i) \oplus G(K^{L(i)}_{a'}, K^{R(i)}_{b'}, i) \oplus (K^i_{\lnot(a' \wedge b')}, 0^k)$$
and intuitively, since $G$ is a PRF, we expect the last $k$ bits of $$G(K^{L(i)}_a, K^{R(i)}_b, i) \oplus G(K^{L(i)}_{a'}, K^{R(i)}_{b'}, i)$$ to differ from $0^k$ with high probability when $(a,b) \neq (a', b')$, so we notice when a decryption isn't successful, allowing us to simply check all four possibilities.
*** Privacy (Why we need the PRF)
Alice should never be allowed to learn both keys for any wire. By construction this holds for the input wires (Alice only ever receives a single key per wire) and it is also true for the internal wires:
- Assume for wire $i$ that Alice knows $K^{L(i)}_1, K^{R(i)}_1$ (note that for most internal wires Alice will not know the plaintext values corresponding to her keys, here $(1,1)$, but for gates connected only to input wires she might), and even let her know the permutation $\pi$. Alice must then not be able to compute $K^i_1$: since she does not know $K^{L(i)}_0, K^{R(i)}_0$, she is missing at least one of the two keys in every other evaluation of $G$, and therefore we expect all the other $C^i_j$ with $\pi(j) \neq (1,1)$ to be indistinguishable from random in Alice's view (as $G$ is a PRF). So we need, for all $i$, that
$$(i, K^{L(i)}_1, K^{R(i)}_1, G(K^{L(i)}_0, K^{R(i)}_0, i), G(K^{L(i)}_0, K^{R(i)}_1, i), G(K^{L(i)}_1, K^{R(i)}_0, i))$$
and
$$(i, K^{L(i)}_1, K^{R(i)}_1, s_1, s_2, s_3)$$
are indistinguishable for $s_1, s_2, s_3 \in_R \{0,1\}^{2k}$.
*** Privacy (Why we need the perm)
During evaluation of any gate, Alice learns the index $j$ s.t. $\tau_j = 0^k$. If the table were not permuted (or if Alice knew the permutation $\pi$ for that gate), she would learn the values of the gate's input wires as $\pi(j) = (a,b)$, which would be quite unfortunate.
*** Privacy definition
We defined privacy by asking that there exists a sim that on input $struct(f)$ and $z = f(x)$ can simulate $(F, X, d)$:
1) Parse $struct(f)$ as the number of wires $T$, the number of inputs $n$ and the wiring functions $L$, $R$.
2) Choose $T+1$ random keys $K^i \in \{0,1\}^k$. Define $X = \{K^i\}_{i \in [1,n]}$ and $d = (Z_0, Z_1) = (K^T, K^{T+1})$.
3) For $i \in [n+1, T]$ (all non-input wires), choose a random $h \in \{0,1,2,3\}$ and three random strings $C^i_j \leftarrow \{0,1\}^{2k}$ for $j \in \{0,1,2,3\}, j \neq h$. Define:
$$C^i_h = G(K^{L(i)}, K^{R(i)}, i) \oplus (K^i, 0^k)$$
for all $i \in [n+1, T-1]$, and for the output gate define $$C^T_h = G(K^{L(T)}, K^{R(T)}, T) \oplus (K^{T+z}, 0^k)$$ (here $K^{T+z}$ is just $Z_z$, so the simulated garbled output decodes to the correct value $z$).
So the sim creates a garbled circuit s.t. there exists a single key per wire and each garbled table contains a single valid encryption and 3 random ciphertexts. Note that the output gate is treated differently, to make sure it outputs the right value as defined by the decoding information $d$.
** Active Security
- Yao's protocol is only passively secure
*** Active Alice
- An actively corrupted Alice can't do much, as long as the OT protocol is actively secure. This is due to:
+ Garbled Circuit Authenticity: $G$ is said to enjoy authenticity if for all PPT adversaries $A$, for all $f: \{0,1\}^n \rightarrow \{0,1\}$ and all $x \in \{0,1\}^n$: $Pr(De(d,A(F,X)) \not\in \{f(x), \perp\}) < negl(k)$, where $(F,e,d) \leftarrow Gb(1^k, f)$, $X \leftarrow En(e,x)$.
+ This essentially states that an adversary who is given a garbled circuit $F$ and a garbled input $X$ can only compute the right garbled output or something which fails to decode (the $\perp$). This implies that even an actively corrupted Alice can't cheat in Yao's protocol, i.e. Alice won't be able to affect the output of the evaluation beyond causing an error. This property is also used in Verifiable Delegation of Computation.
- In the protocol as described above, a corrupted Alice sees the garbled circuit $F$ before she has to choose her input and feed it into the OT protocol. This allows her to pick her input based on $F$ and perform so-called "selective opening attacks". (TODO: write up the details.)
This can be easily fixed by changing the order of the messages in the protocol, so instead of sending $F,Y$ and then running the OT, you can simply run the OT first.
*** Active Bob
Bob can attack this protocol in several ways. He can change the function and garble a different $f^*$ than the one they had originally agreed on. Due to the privacy property of the garbled circuit, Alice has no idea which function is garbled, so she can't tell $F$ and $F^*$ apart as long as they have the same structure. In particular, $F^*$ could be a function allowing Bob to pick the output or one that leaks something about Alice's input.
*** Cut-and-choose
Similar to what is done in ZK proofs. The protocol has Bob garble the function $f$ multiple times, say $s$ times, with independently chosen randomness, and send $F_1, \dots, F_s$ to Alice. Alice then chooses some of these garbled circuits at random and has Bob reveal the randomness he used to generate them, so that Alice can verify that he garbled the correct function. (How do we ensure Bob chose his randomness honestly, e.g. via coin-flipping into the well? Otherwise he might try to craft a garbling that "opens" to two different functions under two different random tapes, which would completely break security; for a reasonable garbling scheme finding such a garbling should however be computationally infeasible.) Now, if Alice does not detect any cheating, she assumes the remaining unopened circuits are correctly produced, and Alice and Bob evaluate those. How many circuits are evaluated gives two main classes of cut-and-choose protocols:
1) "All-but-one" cut-and-choose: Alice chooses one index $j \in [1,s]$ at random. Bob sends the randomness used to garble all circuits $F_i$ with $i \neq j$. If Alice does not find any garbling of a wrong function $f^* \neq f$, the protocol proceeds with the remaining circuit. This still allows Bob to cheat with probability $1/s$: he garbles a single malicious circuit and hopes Alice leaves exactly that one unopened. A cheating probability of $1/s$ is not negligible.
2) "Majority" cut-and-choose: Alice chooses a subset $J \subset [1,s]$ of size $|J| \approx \frac{s}{2}$ of circuits to be checked, and afterwards the protocol proceeds with evaluating the remaining $\approx s/2$ unopened circuits. If the evaluated circuits output different values, Alice outputs the majority value. This solves the "single malicious circuit" problem. However, even if Alice detects cheating among the evaluated circuits, she must not complain to Bob. Complaining opens up a /selective-failure/ attack: Bob garbles two functions $f_1, f_2$ which output the same value if the first bit of $x$ is $0$ and different values otherwise; depending on whether Alice aborts/complains or not, Bob then learns one bit of Alice's input.
There is one last problem with the majority variant. Now that Alice and Bob evaluate $s/2$ different circuits, we need to make sure that the same inputs are used in all evaluations. Otherwise Alice or Bob could learn the evaluation of the function on up to $s/2$ different inputs, which is an issue, as $f(x,y_1), \dots, f(x,y_{s/2})$ might leak more information than a single run of the protocol on $f(x,y)$.
*** Another selective failure attack
Bob can use the "wrong" inputs during the OT phase. Assume that for some position $i$, Bob inputs the pair $(m_0, m_1) = (K^i_0, R)$ to the OT, for some random string $R$. Now, if the $i$-th bit of Alice's input is equal to $0$, Alice learns $K^i_0$ from the OT, which allows correct evaluation; otherwise she gets a random string, the $Ev$ algo fails and Alice aborts. By observing whether Alice aborts or not, Bob learns Alice's input bit $x_i$.
** Verifiable Delegation of Computation Using Garbled Circuits
A computationally limited client (such as a smartphone) wants to delegate the computation of a circuit $f$ on input $x$ to a powerful server. The client is afraid that the server might produce a wrong output, so the client wants to verify that the output $z$ he receives from the server satisfies $z = f(x)$. The naive solution is for the client to compute $f(x)$ itself, so we want a solution faster than that. Garbled circuits can be used to achieve a partial solution, delegation with preprocessing:
1) Preprocessing: Client computes $(F,e,d) <- Gb(f, 1^k)$ and sends $F$ to the server.
2) Online phase: When input $x$ becomes available, the client generates $X <- En(e,x)$ and sends $X$ to the server. The server computes $Z <- Ev(F,X)$ and returns $Z$ to the client, who outputs $z <- De(d,Z)$.
Due to authenticity, the client is sure that if $z \neq \perp$, then $z = f(x)$. The running time of online phase is independent of circuit size (size of $f$).
However, the complexity of preprocessing (running time of $Gb$) is proportional to $|f|$, which is why this is only a partial solution.
* Homomorphic Encryption
- Definition and applications.
- Paillier encryption (additively homomorphic).
- Bootstrapping: from bounded homomorphic encryption to unbounded homomorphic encryption.
- (Brief) Example of bounded homomorphic encryption scheme "with noise"
** Paillier's Cryptosystem
- Variant of the RSA encryption scheme, uses modulo $n^2$ rather than $n$.
- Allows to encrypt messages in $Z_n$.
- Uses two mappings from $Z_n$ to $Z_{n^2}$.
+ The first mapping takes care of messages and ensures the scheme is homomorphic
+ The second takes care of randomness and makes sure the scheme is secure
+ The two mappings can be combined in $Z_{n^2}$.
- The first mapping is an additively homomorphic mapping of elements of $Z_n$ into $Z_{n^2}$:
+ $a(x) = (1 + x \cdot n) \mod n^2$ (showing that this is homomorphic is an exercise; see the derivation below, after this list)
+ It also has an inverse: for all elements of $Z_{n^2}$ in the image of $a$, it is possible to recover $x$ from $(1+x \cdot n) \mod n^2$. (Finding the explicit inverse is an exercise.)
- The second mapping, $b$, also maps elements of $Z_n$ into $Z_{n^2}$, and it is essentially the RSA encryption function, but modulo $n^2$ and with the public exponent $e$ fixed to $n$:
+ $b(x) = x^n \mod n^2$ (showing that this is multiplicatively homomorphic is an exercise; it works just like for RSA, see the derivation below)
- The security of Paillier encryption is based on the assumption that it is hard to distinguish uniformly random elements of $Z_{n^2}$ from outputs of $b(x)$ (similar in spirit to the assumption underlying semantic security for RSA-based encryption).
- On the other hand, it is easy to distinguish if you know the factorisation of $n$ (as is also true for RSA): knowing $pq = n$ lets you compute $\phi(n) = (p-1)(q-1)$, which allows you to test whether a given ciphertext $y$ is in the image of $b$:
+ Check whether $y^{\phi(n)} = 1 \mod n^2$
+ If $y = b(x)$, then $b(x)^{\phi(n)} = (x^n)^{\phi(n)} = x^{n \cdot \phi(n)} = x^{\phi(n^2)} = 1 \mod n^2$
+ This holds because $\phi(n^2) = n \cdot \phi(n)$ and every group element raised to the group order gives $1$. However, most elements of $Z_{n^2}$ will not give $1$ when raised to $\phi(n)$.
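For reference, the two homomorphism exercises referred to above work out as follows (all congruences are mod $n^2$, with the arguments of $a$ and $b$ reduced mod $n$):
$$a(x) \cdot a(y) = (1+xn)(1+yn) = 1 + (x+y)n + xy \cdot n^2 \equiv 1 + (x+y)n = a(x+y)$$
$$b(x) \cdot b(y) = x^n \cdot y^n = (xy)^n = b(xy)$$
and the inverse of $a$ on its image is $a^{-1}(y) = (y-1)/n$ (an exact integer division).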
*** The construction
Keygen:
- Sample primes $p,q$ of equal length.
- Set $n = pq$
- Output $sk = \phi(n) = (p-1)(q-1)$ and $pk = n$
Encryption:
- Given $m \in Z_n$, sample random $r \in_R Z^*_n$
- Outputs $C = a(m) * b(r) = (1 + m*n)*r^n \mod n^2$
Decryption:
- Compute $m = a^{-1}(C^{sk} \mod n^2) * sk^{-1} \mod n$
Correctness:
- Follows from $a$ being invertible (see the derivation above) and from $\phi(n^2) = n \cdot \phi(n)$
- The inverse of $a$ is $a^{-1}(y) = (y-1)/n$ (an exact integer division, since $y-1$ is a multiple of $n$ on the image of $a$)
Security:
- Follows from $b(r)$ and $Z_{n^2}$ being indistinguishable
- The scheme is additively homomorphic, thus:
+ $E(m_1, r_1) \cdot E(m_2, r_2) = a(m_1)b(r_1)a(m_2)b(r_2) = a(m_1 + m_2)b(r_1 \cdot r_2) = E(m_1+m_2, r_1 \cdot r_2) \mod n^2$
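A toy Python implementation of the construction above (tiny primes chosen purely for illustration; `pow(x, -1, n)` for modular inverses needs Python 3.8+):

#+begin_src python
# Toy Paillier: keygen, enc, dec and the additive homomorphism.
import math, random

p, q = 11, 13                  # toy primes of equal length
n = p * q
n2 = n * n
sk = (p - 1) * (q - 1)         # sk = phi(n)

def a(x):                      # additively homomorphic mapping
    return (1 + x * n) % n2

def a_inv(y):                  # inverse of a on its image
    return (y - 1) // n

def enc(m):
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (a(m) * pow(r, n, n2)) % n2

def dec(C):
    return (a_inv(pow(C, sk, n2)) * pow(sk, -1, n)) % n

m1, m2 = 42, 99
c1, c2 = enc(m1), enc(m2)
assert dec(c1) == m1
assert dec((c1 * c2) % n2) == (m1 + m2) % n   # additive homomorphism
#+end_src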
*** Using Paillier in the arithmetic BeDOZa protocol
- The dealer in BeDOZa can be replaced with a two-party multiplication protocol based on the Paillier cryptosystem, if we use BeDOZa over $Z_n$ where $n$ is the public Paillier modulus and only Alice knows the factorisation of $n$.
- If we recall how mul works in BeDOZa:
+ Alice has $(x_A, y_A) \in Z_n \times Z_n$ and Bob has $(x_B, y_B) \in Z_n \times Z_n$, and in the end Alice and Bob hold $(z_A, z_B) \in Z_n \times Z_n$ s.t. $z_A + z_B = (x_A+x_B) \cdot (y_A+y_B) = x_A y_A + x_A y_B + x_B y_A + x_B y_B$
- So Alice can compute $x_A*y_A$ locally and Bob can compute $x_B*y_B$, but they will need to interact to compute $x_A*y_B$ and $x_B * y_A$. To compute the shares of $x_A*y_B$:
1) Alice sends $U = E_n(x_A)$ to Bob
2) Bob samples a random $s_B \in_R Z_n$, computes $V = U^{y_B} \cdot E_n(-s_B) \mod n^2$ and sends $V$ to Alice
3) Alice computes $s_A = D_{\phi(n)}(V)$, i.e. she decrypts $V$ using her Paillier secret key $\phi(n)$
- Due to the homomorphic property of $E$, it holds that $s_A + s_B = x_A*y_B \mod n$. Shares of $x_B*y_A$ can be computed similarly (a toy sketch of this multiplication step follows after this list).
  + The protocol is passively secure: the simulator for a corrupted Bob would set $U = E(0)$, which is indistinguishable from the real $U$ due to the semantic security of Paillier. For a corrupted Alice, the simulator would send $V = E(s_A)$, which is distributed like $V$ in the real protocol.
  + Bob learns nothing about $x_A$ from $U$, unless he breaks the security of Paillier. Alice learns nothing about $y_B$ because, since $s_B$ is uniformly random, $s_A$ is uniformly distributed in $Z_n$ in Alice's view.
  + The protocol can be compiled to active security via ZK proofs in the multiplication protocol and MACs to ensure authenticity.
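- A toy sketch of the share-multiplication step. The helper name =share_product= is my own, and the code reuses the toy =keygen= / =encrypt= / =decrypt= functions from the Paillier block above, so run it after that block:
#+BEGIN_SRC python
import random

def share_product(pk, sk, x_A, y_B):
    # reuses the toy keygen/encrypt/decrypt from the Paillier sketch above
    n, n2 = pk, pk * pk
    U = encrypt(pk, x_A)                                  # 1) Alice -> Bob: U = E_n(x_A)
    s_B = random.randrange(n)                             # 2) Bob: random share s_B
    V = pow(U, y_B, n2) * encrypt(pk, (-s_B) % n) % n2    #    V = U^{y_B} * E_n(-s_B) mod n^2
    s_A = decrypt(pk, sk, V)                              # 3) Alice: s_A = D_{phi(n)}(V)
    return s_A, s_B

pk, sk = keygen()
x_A, y_B = 123, 456
s_A, s_B = share_product(pk, sk, x_A, y_B)
assert (s_A + s_B) % pk == (x_A * y_B) % pk               # additive shares of x_A * y_B mod n
#+END_SRC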
** Fully Homomorphic Encryption Scheme
*** Defining FHE
- A FHE scheme is defined with four algorithms: Gen, Enc, Dec and Eval. As such, it is defined like a normal PKE scheme, but with an additional algo Eval.
- Eval takes a vector of ciphertexts and a function $f$ and outputs an encryption of the function applied to the plaintexts.
+ Let $(pk, sk) <- Gen(1^k)$, then correctness states: $Dec_{sk}(Eval(f, Enc_{pk}(x_1), ..., Enc_{pk}(x_n))) = f(x_1, ..., x_n)$
- Paillier is homomorphic with respect to any function $f_{a,b}(x) = ax+b \mod n$ (a small Eval for this family is sketched after this list).
- A FHE scheme is homomorphic for any function $f$ defined as a circuit.
- We also define a d-HE scheme, which allows evaluating any function that can be expressed by a circuit of depth at most $d$.
- We require a property called "*compactness*"
  + This states that the algorithm Eval should actually perform the computation of $f$ (its output must not grow with the description of $f$); otherwise the following trivial "scheme" would qualify:
  + We simply let Eval append the function, described as a circuit, to the ciphertext. Then to decrypt, we first decrypt the ciphertext and then apply the function ourselves.
- The naive scheme has other issues besides efficiency.
- If we want to use an FHE scheme to build a secure computation protocol, we typically need Eval to hide as much information as possible about which function was computed.
  + This is apparent in the multiplication subprotocol used in BeDOZa above, where Alice must not learn which affine function $f_{a,b}$ Bob applied to her encrypted input.
  + This property is called *circuit privacy*.
- *Circuit privacy* states that: An encryption scheme $(Gen, Enc, Dec, Eval)$ has *circuit privacy* w.r.t a function family $F$, if for all $(pk, sk) <- Gen(1^k)$, plaintext $x$ and function $f \in F$, it holds that: $(sk, Enc_{pk}(f(x))) \approx_c (sk, Eval_{pk}(f, Enc_{pk}(x)))$
  + This essentially means that it should be infeasible to tell whether the function was applied to the plaintext before encryption or evaluated homomorphically on the ciphertext afterwards.
- Yao's garbled circuits combined with a 2-message OT essentially give a homomorphic encryption scheme with a reasonable level of circuit privacy, since any function can be evaluated on the encrypted inputs.
  + They do not satisfy *compactness*, however: the "ciphertext" grows with the size of the circuit being evaluated.
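- A small sketch of an Eval for the affine family $f_{a,b}(x) = ax + b \mod n$. It reuses the toy Paillier functions from the block above; =eval_affine= is a hypothetical helper, not from the book:
#+BEGIN_SRC python
# Eval for f_{a,b}(x) = a*x + b mod n, built from the toy Paillier sketch above.
def eval_affine(pk, a, b, c):
    n2 = pk * pk
    # E(x)^a is an encryption of a*x; multiplying by a fresh E(b) adds b and
    # rerandomises the result, so it is distributed like a fresh encryption
    # of a*x + b (which is what gives circuit privacy for this family).
    return pow(c, a, n2) * encrypt(pk, b) % n2

pk, sk = keygen()
x, a, b = 42, 7, 100
assert decrypt(pk, sk, eval_affine(pk, a, b, encrypt(pk, x))) == (a * x + b) % pk
#+END_SRC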
*** Bootstrapping
- The central idea of FHE is bootstrapping.
- Assume we are given a d-HE scheme $(Gen, Enc, Dec, Eval)$, where Eval can evaluate any circuit of depth at most $d$; if we evaluate a circuit of depth $> d$, the $Dec$ algorithm is no longer guaranteed to work. Assume furthermore that Dec can be described as a circuit of depth $d'$ with $d' < d$. Then we can construct an FHE scheme that can evaluate circuits of any polynomial depth.
- We need to introduce the *level* of a ciphertext. A ciphertext has level 0 if it is the output of the $Enc$ function. A ciphertext obtained by applying Eval once, $c' = Eval_{pk}(f,c)$, has level $0 + j$ where $j$ is the depth of the circuit $f$; in general $c'' = Eval_{pk}(f,c')$ has level $i+j$, where $i$ is the level of $c'$ and $j$ the depth of $f$.
- We can finally construct an FHE scheme $(Gen, Enc, Dec, Eval)$: the new $Gen$ runs $(pk, sk) <- Gen(1^k)$ of the underlying scheme and sets $SK = sk$ and $PK = (pk, Enc_{pk}(sk))$, i.e. we publish an encryption of the secret key under itself. If this is secure, the underlying scheme is called *circular secure*; not all semantically secure schemes have this property.
  + Imagine we take an IND-CPA secure scheme but modify the Enc algorithm so that, should you ever try to encrypt the secret key itself, it simply outputs the key in the clear. The result is, for all practical purposes, still semantically secure, but it is very much not circular secure.
- $Enc$ and $Dec$ behave exactly as in the original scheme; the work goes into defining $Eval$, which uses the technique of *refreshing*:
  + Take a ciphertext $C$ at some level $i \leq d$, so $C$ can be correctly decrypted to $m = Dec_{sk}(C)$.
  + Consider the circuit $f(x) = Dec_x(C)$, which decrypts the ciphertext $C$ using its input $x$ as the key. Note that $C$ is not an argument to $f$; it is hard-wired into the description of the circuit. By assumption, $f$ has depth $d' < d$.
  + Since $PK$ contains an encryption of $sk$, we can compute $C' = Eval(f, Enc_{pk}(sk))$. The result $C'$ has level $0 + d' = d'$, since $Enc_{pk}(sk)$ has level $0$ and $f$ has depth $d'$.
  + The level of $C'$ therefore does not depend on the level $i$ of the original ciphertext $C$; if $d' < i$, then $C'$ is at an even lower level than $C$.
  + Since $d' < d$, $C'$ decrypts correctly: $Dec_{sk}(C') = f(sk) = Dec_{sk}(C) = m$, where the inner decryption is correct because $i \leq d$.
  + So we started with an encryption $C$ of $m$ and ended up with another encryption $C'$ of $m$, but we have decreased (refreshed) its level!
- Thus, we can construct an Eval function which takes any two encryptions of bits $a,b$ at level at most $d$ and outputs an encryption of the value $\lnot(a \cdot b)$ (the NAND of the bits) at level $d' + 1$. Since this output again has level $\leq d$, it can itself be fed into the same Eval function, which means any boolean circuit can be evaluated in this way, as any function can be computed by a circuit of NAND gates.
- Concretely, this Eval algo takes as input ciphertexts $C_1$ s.t. $Dec_{sk}(C_1) = a$ and $C_2$ s.t. $Dec_{sk}(C_2) = b$ and outputs a $C_3$ s.t. $Dec_{sk}(C_3) = \lnot(a \cdot b)$. It works by applying the circuit $f(x) = \lnot(Dec_x(C_1) \cdot Dec_x(C_2))$ to the encryption of the secret key, using the underlying Eval. The depth of $f$ is $d' + 1 \leq d$, so decryption is still correct, and we can evaluate any circuit gate by gate (see the structural sketch after this list).
- Thus, we need to build a d-HE scheme!
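- A structural sketch of the bootstrapped NAND-Eval. The =dhe= interface below (=keygen=, =enc=, =eval_circuit=, =nand_of_decryptions_circuit=) is entirely hypothetical and not implemented; the point is only how the refresh step is wired together, not a working scheme:
#+BEGIN_SRC python
# Structural sketch only -- the d-HE object "dhe" is a placeholder interface.

def fhe_keygen(dhe):
    pk, sk = dhe.keygen()
    enc_sk = dhe.enc(pk, sk)          # circular-security assumption: this is safe
    return (pk, enc_sk), sk           # PK = (pk, Enc_pk(sk)), SK = sk

def fhe_eval_nand(dhe, PK, C1, C2):
    pk, enc_sk = PK
    # f(x) = NOT( Dec_x(C1) * Dec_x(C2) ); C1 and C2 are hard-wired into the
    # circuit description, x (the key) is the only input. Depth of f is d'+1 <= d.
    f = dhe.nand_of_decryptions_circuit(C1, C2)
    # Evaluating f on Enc_pk(sk) yields a level-(d'+1) ciphertext, independent of
    # the levels of C1 and C2, so the output can be fed into further NAND gates.
    return dhe.eval_circuit(pk, f, enc_sk)
#+END_SRC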
*** Building a d-HE scheme
- A d-HE scheme can also be used as is: we might not need a fully homomorphic scheme if we know the circuits we want to evaluate will always have depth $\leq d$.
KeyGen:
- Choose a big secret odd integer $p$. Choose big random integers $q_1, ..., q_n$ and small integers $r_1, ..., r_n$. The secret key is $p$, the public key is $(y_1, ..., y_n)$, where $y_i = p * q_i + 2r_i$.
Encryption:
- To encrypt $m \in \{0,1\}$, choose a random subset $S \subseteq [1,n]$ and compute, over the integers, $c = m + \sum_{i \in S} y_i$
Decryption:
- Compute $m = ((c \mod p) \mod 2)$
Correctness:
- $c = m + 2(\sum_{i \in S} r_i) + p(\sum_{i \in S} q_i)$, so $c \mod p = m + 2(\sum_{i \in S} r_i)$ as long as that value is smaller than $p$
- If we set the parameters s.t. $m + 2(\sum_{i \in S} r_i) < p$, the noise is not reduced by the mod $p$, and then $(m + 2(\sum_{i \in S} r_i)) \mod 2 = m$.
Security:
- Security relates to the assumption that it is hard to recover $p$ from the public key. Without the error terms $r_i$ it would be trivial to compute $p$ by simply taking the $gcd$ of all the $y_i$'s: each $y_i$ would be a multiple of $p$, and since the $q_i$'s are big random integers, the gcd of a few of them is $p$ with high probability. Adding the noise (which is much smaller than $p$) is believed to make the problem hard: recovering $p$ from values of the form $y_i = p \cdot q_i + 2r_i$ is the *approximate GCD problem*, for which no polynomial-time algorithms are known.
Homomorphism:
- The scheme is homomorphic: if $C_1 = Enc_{pk}(m_1, S_1)$ and $C_2 = Enc_{pk}(m_2, S_2)$, then $C_1 + C_2 \mod p = (m_1 + m_2) + 2(\sum_{i \in S_1} r_i + \sum_{j \in S_2} r_j)$ and $C_1 \cdot C_2 \mod p = m_1 \cdot m_2 + 2(m_1 \cdot \sum_{j \in S_2} r_j + m_2 \cdot \sum_{i \in S_1} r_i + 2\sum_{i \in S_1, j \in S_2} r_i r_j)$, where the second term is removed by the mod 2 in decryption (a toy implementation follows after this subsection).
- One issue with this however, is that every single homomorphic operation will make this noise grow, which is not exactly ideal.
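- A toy version of the scheme with absurdly small, purely illustrative parameters (the =dhe_*= names and the parameter choices are my own), showing correctness and that ciphertext addition/multiplication give XOR/AND of the plaintext bits:
#+BEGIN_SRC python
import random

def dhe_keygen(num_y=20, p_bits=32, q_bits=64, r_bits=4):
    p = random.getrandbits(p_bits) | (1 << (p_bits - 1)) | 1   # big odd secret p
    ys = [p * random.getrandbits(q_bits) + 2 * random.getrandbits(r_bits)
          for _ in range(num_y)]                               # y_i = p*q_i + 2*r_i
    return ys, p                                               # pk, sk

def dhe_encrypt(ys, m):
    S = [y for y in ys if random.random() < 0.5]   # random subset of the y_i
    return m + sum(S)                              # computed over the integers

def dhe_decrypt(p, c):
    return (c % p) % 2

ys, p = dhe_keygen()
c0, c1 = dhe_encrypt(ys, 0), dhe_encrypt(ys, 1)
assert (dhe_decrypt(p, c0), dhe_decrypt(p, c1)) == (0, 1)
# ciphertext addition = XOR of plaintexts, multiplication = AND of plaintexts
assert dhe_decrypt(p, c0 + c1) == 1
assert dhe_decrypt(p, c0 * c1) == 0
#+END_SRC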
Conclusion:
- This scheme is very simple. However, it is also not very secure unless the parameters are very large. To get approximately 60 bits of security (roughly comparable to a 60-bit symmetric key), we have to require that $p$ is roughly $2000$ bits long, that the $q_i$'s are about $10^7$ bits long (roughly 1 MB), that the $r_i$'s are roughly 60 bits long and that there are roughly $2000$ values $y_i$ in the public key.
- Furthermore, after one encryption the noise is approximately $2^{70}$, and so after just $4$ multiplications the noise will be greater than $p$. This was however enough to implement the Blood Type handin!