# arbo [![GoDoc](https://godoc.org/github.com/vocdoni/arbo?status.svg)](https://godoc.org/github.com/vocdoni/arbo) [![Go Report Card](https://goreportcard.com/badge/github.com/vocdoni/arbo)](https://goreportcard.com/report/github.com/vocdoni/arbo) [![Test](https://github.com/vocdoni/arbo/workflows/Test/badge.svg)](https://github.com/vocdoni/arbo/actions?query=workflow%3ATest)

> *arbo*: tree in Esperanto.
Merkle tree implementation in Go, compatible with the circomlib implementation of the Merkle tree. Specification: https://docs.iden3.io/publications/pdfs/Merkle-Tree.pdf and https://eprint.iacr.org/2018/955.

The main characteristics of arbo are:
- Allows choosing which hash function to use.
  - For example, when working with zkSNARKs the [Poseidon hash](https://eprint.iacr.org/2019/458.pdf) function can be used, and otherwise the [Blake2b hash](https://www.blake2.net/blake2.pdf) function, which is much faster to compute.
- New hash functions can be plugged in by simply implementing the interface (see the sketch after this list).
- Parallelizes computation across the available CPUs.
  - See the [AddBatch section](https://github.com/vocdoni/arbo#addbatch).
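
As a sketch of what plugging in a new hash function could look like (the exact interface is defined in the arbo source; the method set used here, `Type`, `Len` and `Hash`, is an assumption based on how the built-in hash functions appear in this README):

```go
import "crypto/sha256"

// HashSha256 is a hypothetical hash function implementation for arbo.
type HashSha256 struct{}

// Type returns an identifier of the hash function.
func (HashSha256) Type() []byte { return []byte("sha256") }

// Len returns the length in bytes of the hash output.
func (HashSha256) Len() int { return 32 }

// Hash concatenates the inputs and returns their SHA-256 digest.
func (HashSha256) Hash(b ...[]byte) ([]byte, error) {
	h := sha256.New()
	for _, bi := range b {
		h.Write(bi) // hash.Hash.Write never returns an error
	}
	return h.Sum(nil), nil
}
```

If the interface matches, such a type could then be passed to `arbo.NewTree` in place of `arbo.HashFunctionBlake2b`.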
## AddBatch

The method `tree.AddBatch` is designed for cases where a large number of key-values has to be added to the tree. It has the following characteristics:
- Parallelizes across the available CPUs.
- If the tree size is under the configured threshold:
  - Makes a copy of the tree in memory (*VirtualTree*).
  - The *VirtualTree* does not compute any hashes, only the relations between the nodes of the tree.
    - This step (computing the *VirtualTree*) is done in parallel on each available CPU until level *log2(nCPU)*.
  - Once the *VirtualTree* is updated with all the new leaves (key-values) in their corresponding positions, it *computes all the hashes* of each node up to the root.
    - In this way, each node hash is computed only once, whereas when adding many key-values through the `tree.Add` method, most of the intermediate node hashes are recalculated each time a new leaf is added.
    - This step (*computing all the hashes*) is done in parallel on each available CPU.
- If the tree size is above the configured threshold:
  - Virtually splits the tree into `n` sub-trees, where `n` is the number of available CPUs.
  - Each CPU adds the corresponding new leaves to its sub-tree (working in a db tx).
  - Once all the sub-trees are updated, they are merged back together to compute the new tree root.

As a result, the method `tree.AddBatch` is much faster than looping over `tree.Add`, and the computation is parallelized, so the more CPUs available, the faster it runs.
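
The single-pass idea can be illustrated independently of arbo's internals. The following is a conceptual sketch (not the library's code) that computes each node hash exactly once, bottom-up, assuming a power-of-two number of leaves:

```go
import "crypto/sha256"

// rootFromLeaves computes the root of a complete binary tree bottom-up,
// hashing each node exactly once (O(n) hashes for n leaves). A tree.Add
// loop would instead rehash the whole path to the root on every insert,
// for O(n log n) hashes in total.
func rootFromLeaves(leaves [][]byte) []byte {
	level := leaves
	for len(level) > 1 {
		next := make([][]byte, 0, len(level)/2)
		for i := 0; i < len(level); i += 2 {
			h := sha256.New()
			h.Write(level[i])
			h.Write(level[i+1])
			next = append(next, h.Sum(nil))
		}
		level = next
	}
	return level[0]
}
```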
As an example, this is the benchmark for adding `10k leaves` (with `4 CPU cores`; `AddBatch` gets faster with more CPUs, in powers of 2):
```
Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz with 8GB of RAM
nCPU: 4, nLeafs: 10_000

Using Poseidon hash function:
(go) arbo.AddBatch:  436.866007ms
(go) arbo.Add loop:  5.341122678s
(go) iden3.Add loop: 8.581494317s
(js) circomlibjs:    2m09.351s
```
And, for example, if instead of the Poseidon hash function we use Blake2b, the time is reduced to `80.862805ms`.
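
To reproduce this kind of comparison locally, a rough harness could look like the sketch below. It reuses the `db.NewBadgerDB` constructor shown in the Usage section (whose import path depends on the arbo version); `timeAddMethods` is a hypothetical helper, and error handling is omitted for brevity:

```go
import (
	"fmt"
	"time"
)

// timeAddMethods compares a tree.Add loop against a single tree.AddBatch
// call on the same set of leaves.
func timeAddMethods(nLeaves int) {
	keys := make([][]byte, nLeaves)
	values := make([][]byte, nLeaves)
	for i := range keys {
		keys[i] = []byte(fmt.Sprintf("key-%d", i))
		values[i] = []byte(fmt.Sprintf("value-%d", i))
	}

	// tree.Add in a loop recomputes intermediate hashes on every insert
	database1, _ := db.NewBadgerDB("/tmp/arbo-bench-add")
	tree1, _ := arbo.NewTree(database1, 100, arbo.HashFunctionBlake2b)
	start := time.Now()
	for i := range keys {
		_ = tree1.Add(keys[i], values[i])
	}
	fmt.Println("Add loop:", time.Since(start))

	// tree.AddBatch hashes each node once, in parallel across CPUs
	database2, _ := db.NewBadgerDB("/tmp/arbo-bench-batch")
	tree2, _ := arbo.NewTree(database2, 100, arbo.HashFunctionBlake2b)
	start = time.Now()
	invalids, _ := tree2.AddBatch(keys, values)
	fmt.Println("AddBatch:", time.Since(start), "invalid leaves:", len(invalids))
}
```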
## Usage

```go
// create a new database (any writable path works here)
database, err := db.NewBadgerDB("/tmp/arbo-example")

// create a new Tree with maxLevels=100 and the Blake2b hash function
tree, err := arbo.NewTree(database, 100, arbo.HashFunctionBlake2b)

key := []byte("hello")
value := []byte("world")

// add a key & value to the merkle tree
err = tree.Add(key, value)

// when multiple key-values (leaves) are going to be added to a Tree, it is
// more efficient to use:
invalids, err := tree.AddBatch(keys, values)

// generate the merkle proof of a leaf by its key
value, siblings, err := tree.GenProof(key)

// verify the proof
verified, err := arbo.CheckProof(tree.HashFunction(), key, value, tree.Root(), siblings)
if !verified {
	fmt.Println("proof could not be verified")
}

// get the value of the leaf assigned to a key
retrievedKey, retrievedValue, err := tree.Get(key)

// update the value of the leaf assigned to a key
err = tree.Update(key, value)

// dump the tree (the leaves)
dump, err := tree.Dump(nil) // instead of nil, a root to start from can be used

// import the dump into a tree
err = tree.ImportDump(dump)

// print a graphviz diagram of the tree
err = tree.PrintGraphviz(nil) // instead of nil, a root to start from can be used
```
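
Note that `arbo.CheckProof` is a package-level function: a verifier only needs the root, the key-value pair and the siblings, not the tree itself. A minimal verifier-side sketch (the wrapping function is illustrative):

```go
// verifyLeaf checks a merkle proof received from elsewhere (e.g. over the
// network) without any access to the tree or its database. The hash
// function must match the one used by the tree that produced the proof.
func verifyLeaf(root, key, value, siblings []byte) (bool, error) {
	return arbo.CheckProof(arbo.HashFunctionBlake2b, key, value, root, siblings)
}
```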
### Usage with SNARKs compatibility

Arbo is designed to be compatible with the snark-friendly merkletree of the
[circom merkle tree](https://github.com/iden3/circomlib/tree/master/circuits/smt).
The only change needed is the hash function used for the Tree, for example
using the Poseidon hash function:
```go
tree, err := arbo.NewTree(database, 32, arbo.HashFunctionPoseidon)
```
Be aware of the characteristics of this kind of hash function, such as the
need for the values to be inside the finite field used by the hash, and its
computation time.

The arbo interface uses byte arrays; for this kind of hash function (which
usually works directly with finite field elements), arbo expects those values
to be represented as little-endian byte arrays. There is a helper method to
convert a `*big.Int` to a `[]byte` in little-endian:
```go
bLen := tree.HashFunction().Len()
kBigInt := big.NewInt(100)

// convert *big.Int to byte array
kBytes := arbo.BigIntToBytes(bLen, kBigInt)

// convert byte array to *big.Int
kBigInt2 := arbo.BytesToBigInt(kBytes)
```
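
Putting this together, adding a field-element key and value to a Poseidon tree could hypothetically look like this (the concrete values are illustrative):

```go
bLen := tree.HashFunction().Len()
// both key and value are little-endian encodings of values that must fit
// inside the finite field used by the Poseidon hash
k := arbo.BigIntToBytes(bLen, big.NewInt(100))
v := arbo.BigIntToBytes(bLen, big.NewInt(200))
err = tree.Add(k, v)
```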