You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

89 lines
7.6 KiB

  1. ## Lagrange Polynomial Interpolation and Shamir secret sharing
  2. *2021-10-10*
  3. > If you read this post, be aware that I’m not a mathematician, I’m just an amateur on math studying in my free time, and this article is just an attempt to try to sort the notes that I took while learning about Lagrange polynomial interpolation and Shamir's secret sharing.
  4. Imagine that you have a *secret* (for example a *private key* that can decrypt a file), and you want to backup that *secret*. You can split the *secret* and give each slice to a different person, so when you need to reconstruct the *secret* you just need to put together all the parts. But, what happens if one of the parts gets corrupted, or is lost? The secret would not be recoverable.
  5. A better solution can be done if we use *Shamir Secret Sharing*, which allows us to split the *secret* in $k$ different parts, and set a minimum threshold $n$, which defines the number of required parts to recover the *secret*, so just by putting together any $n$ parts we will recover the original secret.
  6. This has interesting applications, such as social recovery of keys or distributing a secret and ensuring that cooperation is needed in order to recover it. In the following lines we will overview the concepts behind this scheme.
  7. ### Lagrange polynomial interpolation
  8. Lagrange interpolation is also used in many schemes that work with polynomials, for example in [KZG Commitments](https://arnaucube.com/blog/kzg-batch-proof.html) (an actual implementation [can be found here](https://github.com/arnaucube/kzg-commitments-study/blob/master/arithmetic.go#L272)).
  9. The main idea behind is the following: for any $n$ distinct points over $\mathbb{R}^2$, there is a unique polynomial $p(x) \in \mathbb{R[x]}$ of degree $n-1$ which goes through all of them.
  10. From the 'other side' point of view, this means that if we have a polynomial of degree $n-1$, we can take $n$ points (or more) from it, and we will be able to recover the original polynomial from those $n$ points.
  11. We can see this starting with a line. If we are given any two points $P_0=(x_0, y_0)$ and $P_1=(x_1, y_1)$ from that line, we are able to recover the original line.
  12. <div style="text-align:center;">
  13. <img style="width:300px;margin-bottom:20px;" src="img/posts/shamir-secret-sharing/line.png" />
  14. </div>
  15. We can map this into the previous idea, seeing that our line is a degree $1$ polynomial, so, if we pick $2$ points from it, we later can recover the original line.
  16. Same happens with polynomials of degree $2$. Let $p(x)$ be a polynomial of degree $2$ defined by $p(x)= x^2 - 5x - 6$. We can create infinity of polynomials of degree $2$ that go through $2$ points, but with 3 points there is a unique polynomial degree $2$
  17. As the degree is $2$, if we pick $3$ points from the polynomial, we will be able to reconstruct it.
  18. <div style="text-align:center;">
  19. <img style="width:300px;margin-bottom:20px;" src="img/posts/shamir-secret-sharing/degree2.png" />
  20. </div>
  21. This is generalized by using *Lagrange polynomial interpolation*, which defines:
  22. For a set of points $(x_0, y_0), (x_1, y_1), ..., (x_n, x_n)$,
  23. $$
  24. I(x) = \sum_{i=0}^n y_i l_i(x)\newline
  25. where \space\space\space l_i(x) = \prod\_{0\leq j \leq n, j\neq i} \frac{x-x_j}{x_i - x_j}
  26. $$
  27. ### Shamir's secret sharing
  28. As we've seen, for a degree $n-1$ polynomial we can pick $n$ or more points and we will be able to reconstruct the original polynomial from it. This is the main idea used in *Shamir's secret sharing*.
  29. Let $s$ be our secret. We want to generate $k$ pieces and set a threshold $n$ which is the minimum number of pieces that are needed to reconstruct the secret $s$. We can define a polynomial of degree $n-1$, and pick $k$ points from that polynomial, so in this way with just putting together $n$ points of $k$ we will be able to reconstruct the original polynomial. And, we can place our secret $s$ in the *constant term* of the polynomial (the one that has $x^0$), in this way, when we reconstruct the polynomial using $n$ out of $k$ points, we will be able to recover the secret $s$.
  30. We can see this with an example with actual numbers (we will use small numbers):
  31. Imagine that we want to generate $5$ pieces from our secret, and define that just by putting together $3$ of the pieces we can recover the secret, this means setting $n=3$ and $k=5$. Then we will generate a polynomial of degree $n-1=2$, by $p(x) = \alpha_0 + \alpha_1 x + \alpha_2 x^2$, where $\alpha_0 = s$ (the secret).
  32. We will work over a finite field of size $p$, where $p$ is a prime number. For our example we will work over $\mathbb{F}_{19}$, in real world we would work with much more bigger field. You can find an [example without finite fields in Wikipedia](https://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing#Example).
  33. Let our secret be $s=14$. We now generate our polynomial of degree $n-1=2$, where $s$ will be the constant coefficient: $p(x)= s + \alpha_1 x^1 + \alpha_2 x^2$. We can set $\alpha_1$ and $\alpha_2$ into any random value, as example $\alpha_1=4$ and $\alpha_2=6$. So we have our polynomial: $p(x) = 14 + 4 x + 6 x^2$.
  34. Now that we have the polynomial, we can pick $k$ points from it, using incremental indexes for the $x$ coordinate: $P_1=(1, p(1)), P_2=(2, p(2)), \space\ldots\space, P_k=(k, p(k))$. With the numbers of our example this is (remember, we work over $\mathbb{F}\_{19}$):
  35. $$
  36. p(x) = 14 + 4 x + 6 x^2,\newline
  37. p(1)=14 + 4 \cdot 1 + 6 \cdot 1^2 = 24 \space (mod \space 19) = 5\newline
  38. p(2)=14 + 4 \cdot 2 + 6 \cdot 2^2 = 46 \space (mod \space 19) = 8\newline
  39. p(3)=14 + 4 \cdot 3 + 6 \cdot 3^2 = 80 \space (mod \space 19) = 4\newline
  40. p(4)=14 + 4 \cdot 4 + 6 \cdot 4^2 = 126 \space (mod \space 19) = 12\newline
  41. p(5)=14 + 4 \cdot 5 + 6 \cdot 5^2 = 184 \space (mod \space 19) = 13
  42. $$
  43. So our $k$ points are: $(1,5), (2,8), (3,4), (4,12), (5,13)$. We can distribute these points as our 'secret parts'.
  44. In order to recover the secret, we need at least $n=3$ points, for example $P_1$, $P_3$, $P_5$, and we compute the *Lagrange polynomial interpolation* to recover the original polynomial (remember, we work over $\mathbb{F}\_{19}$):
  45. $$
  46. I(x) = \sum_{i=0}^n y_i l_i(x) \space\space
  47. where \space\space\space l_i(x) = \prod\_{0 \leq j \leq n \\ j\neq i} \frac{x-x_j}{x_i - x_j}
  48. $$
  49. $$
  50. l_1(x) = \frac{x-3}{1-3} \cdot \frac{x-5}{1-5} = \frac{x-3}{17} \cdot \frac{x-5}{15}=\frac{x^2+11x+15}{8}\newline
  51. l_3(x) = \frac{x-1}{3-1} \cdot \frac{x-5}{3-5} = \frac{x-1}{2} \cdot \frac{x-5}{17} =\frac{x^2+13x+5}{15}\newline
  52. l_5(x) = \frac{x-1}{5-1} \cdot \frac{x-3}{5-3} = \frac{x-1}{4} \cdot \frac{x-3}{2} = \frac{x^2 + 15x + 3}{8}\newline
  53. $$
  54. $$
  55. I(x) = y_2 \cdot l_2(x) + y_4 \cdot l_4(x) + y_5 \cdot l_5(x)\newline
  56. = 5 \cdot (\frac{x^2+11x+15}{8}) + 4 \cdot (\frac{x^2+13x+5}{15}) + 13 \cdot (\frac{x^2 +15x + 3}{8})\newline
  57. = \frac{5x^2+17x+18}{8} + \frac{4x^2+14x+1}{15} + \frac{13x^2+5x+1}{8}\newline
  58. = 3x^2+14x+7 + 18x^2+6x+14 + 4x^2+3x+12\newline
  59. = 6x^2 + 4x + 14
  60. $$
  61. We can now take the *constant coefficient*, or just evaluate the obtained polynomial at 0, $p(0) = 6 \cdot 0^2 + 4 \cdot 0 + 14 = 14$, and we obtain our original secret $s=14$.
  62. ### Conclusions
  63. As an example of an use case of *Shamir Secret Sharing* we can think of social recovery of keys, there is an useful implementation of this scheme is used in the [banana split by Parity](https://bs.parity.io/). Also, here it is an implementation of the scheme in `Go`&`Rust` done a couple of years ago: https://github.com/arnaucube/shamirsecretsharing.
  64. *Lagrange Interpolation* in its own way, is a very useful tool in many schemes, it is also used in KZG Commitments, in zkSNARKs, zkSTARKs, PLONK, etc. In most of the schemes where polynomials are involved it becomes a very useful tool.