This post will step through how to build a simple zk-STARK. Although I have already outlined the ideas in a couple of posts (STARKS I and II), these took a very general approach and consequently were rather abstract. This post will be more concrete by detailing how one can be built for a specific calculation. For this, I pick a ‘toy’ example for ease of exposition, but one which is still very interesting. The calculation will be based on the rule 30 elementary cellular automaton. There are many ways of actually building a STARK, some of which will perform better than others. I make choices aiming for simplicity and to avoid complex constructions.
First, let’s recap what a zk-STARK is. It is an acronym for zero-knowledge Scalable Transparent Argument of Knowledge. While STARKs are often phrased as proofs of existence of some quantity (e.g., existence of 3-colorings of a graph), more properly, a STARK is a proof that the person constructing it has knowledge of this quantity. We can be a bit more precise. The setup must include some algorithm or computable function F and a value b which it may or may not return. This information will be publicly known, so that anyone can verify what F(a) evaluates to on a given input a. Anyone can also make a statement such as:
I know of a value a such that F(a) evaluates to b.
However, such a statement does not itself prove that they do really know of such a value. An argument of knowledge is a method of proving the truth of this statement. As an example, consider digital signatures. In this scenario, the value a is a private key and F is the calculation deriving the associated public key b. A digital signature is an argument of knowledge of a private key. For elliptic curve based schemes as used in ECDSA and Schnorr signatures, it can be shown mathematically that for any public key there does exist an associated private key, although it can be practically impossible to actually find it. Hence, a digital signature is not a proof that a private key exists, which we know already. Instead, it is a proof that it was known to the person constructing the signature.
One method of proving a statement like the one given above is to simply display the value of a, so that anyone can evaluate F(a) and verify that it does indeed equal b. There are two reasons why this may not be a desired or viable approach, corresponding to the other words making up the acronym zk-STARK. Firstly, the value of a may be a secret, such as with the digital signature example. It should be zero knowledge. Ideally, the proof does not reveal any information beyond the fact that the prover knows of a valid value for a. Secondly, it should be scalable.
By scalable, we mean that a zk-STARK should remain small and reasonably quick to validate, even if F is very computationally expensive to compute. For example, we could take the value a and hash it billions of times to get the result b. If we want to convince others of the result, then expecting them to go through the whole tedious calculation might not be realistic. Instead, a STARK can be constructed, which anyone can check proves that the result is b as claimed without having to redo the entire calculation. This is especially useful for blockchains, where the use of STARKs can greatly reduce the work required by validators and avoid using up a lot of the limited block space. A long list of transactions with their associated digital signatures could be replaced by a much shorter proof of the combined effect of these transactions.
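To make the repeated-hashing example concrete, here is a minimal sketch of such a function F together with the naive way of checking a claimed result. The choice of SHA-256, the round count and the names `iterated_hash` and `naive_verify` are my own assumptions for illustration and are not taken from any particular STARK construction.

```python
import hashlib

def iterated_hash(a: bytes, rounds: int) -> bytes:
    """Apply SHA-256 to `a` repeatedly, `rounds` times in total."""
    value = a
    for _ in range(rounds):
        value = hashlib.sha256(value).digest()
    return value

def naive_verify(a: bytes, b: bytes, rounds: int) -> bool:
    """Check the claim F(a) == b by redoing the whole calculation."""
    return iterated_hash(a, rounds) == b

secret = b"my secret input"   # hypothetical value of a
rounds = 100_000              # illustrative; the text imagines billions
claimed = iterated_hash(secret, rounds)
print(naive_verify(secret, claimed, rounds))
```

The naive check has both drawbacks that a zk-STARK is meant to remove: the verifier has to be handed the input `secret`, and the verification costs exactly as much as the original computation.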
The scalability property is really the magic part of STARKs and the closely related SNARKs. A very long calculation will take a long time to perform. However, once one person has done this, they are able to prove the result to others without requiring them to also go through the long calculation. We really do not have any right to expect that this is even possible, but luckily it is!
Next, STARKs are transparent, which means that there is no trusted setup. This is in contrast to the closely related SNARKs (succinct non-interactive arguments of knowledge), where a procedure is required to initialize the construction of the SNARKs, and which involves data that must be kept secret by those involved. Revealing this data would allow ‘proofs’ to be constructed even though the prover does not have the knowledge being claimed. STARKs do not have this issue.
Finally, I mention that STARKs are non-interactive. Any argument of knowledge involves two parties, the prover and the verifier, with the former trying to convince the latter of the truth of a statement. Interactive proofs involve messages being passed back and forth between the two. Essentially, the verifier asks questions of the prover, and checks his responses, until she is convinced of the proof of the original statement. A non-interactive proof just consists of a single message sent from the prover to the verifier. This is what is needed in many blockchain applications, since the proof can simply be constructed and added to the chain. Anyone can be the verifier by reading the data from the blockchain, as validators do.
As with any argument of knowledge, STARKs are sound, meaning that it is practically impossible for someone without the claimed knowledge to construct a valid proof. However, I should point out that, in theory, it is always possible to just guess at a valid argument by blind luck. For this reason, any such construction will come with a soundness parameter. This is an upper bound on the probability that, without the claimed knowledge, the parameter choices made in its construction lead to a valid proof by chance. The idea is that this should be tiny to avoid such false positives. It is true that an untrustworthy prover could try over and over, choosing different parameters each time, to try and brute-force a solution. As long as the soundness parameter is small enough, say about $2^{-100}$ or lower, it becomes practically impossible to even brute-force a solution.
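To get a feel for why a soundness parameter of this size rules out brute force, the following sketch bounds the chance that a cheating prover succeeds at least once over many attempts. It uses the union bound (the probability of at least one success in N attempts is at most N times the per-attempt bound). The attacker speed of 10^12 attempts per second and the one-century duration are purely hypothetical figures chosen for illustration.

```python
from fractions import Fraction

soundness = Fraction(1, 2**100)     # bound on success probability per attempt
attempts_per_second = 10**12        # assumed attacker speed (hypothetical)
seconds_per_year = 365 * 24 * 60 * 60

def success_bound(attempts: int) -> Fraction:
    """Union bound: P(at least one success) <= attempts * soundness, capped at 1."""
    return min(Fraction(1), attempts * soundness)

# Total number of forgery attempts made over one hundred years.
attempts_in_a_century = attempts_per_second * seconds_per_year * 100
bound = success_bound(attempts_in_a_century)
print(float(bound))
```

Under these assumptions the bound comes out on the order of 10^-9, so even a century of sustained guessing leaves the attacker with a negligible chance of producing a false proof.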
Automaton Rule 30
The toy calculation used in this post is the elementary cellular automaton Rule 30. The idea is that we have a one dimensional array of cells, indexed by the integers. Each cell can be in one of two states, either unset or set, which I will label by the numbers 0 and 1 respectively. We then iteratively update this array, one step at a time. At every step of the calculation, each cell is updated according to its own value and those of its immediate neighbours.
Using $s_i(t)$ to represent the state of cell $i$ at time $t$, I will also denote this by just $s_i$, suppressing the time variable for brevity. Its value $s_i(t+1)$ at time $t+1$, which I denote by $s'_i$, is some function of $s_{i-1}$, $s_i$, $s_{i+1}$:
$$s'_i = S(s_{i-1}, s_i, s_{i+1}).$$
The rule is defined by the function $S$ and, given any initial state, this can be applied to compute the state at every future time. There are many different choices for $S$ giving different cellular automata with different properties. I choose rule 30, for which the state $s'_i$ is determined from $s_{i-1}s_is_{i+1}$ by looking up its value in the following table:
current pattern | 111 | 110 | 101 | 100 | 011 | 010 | 001 | 000 |
new state for center cell | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 |
As there are 8 possible values which $s_{i-1}s_is_{i+1}$ can take, and each of these can lead to two possible states for $s'_i$, there are $2^8 = 256$ distinct elementary cellular automata. This particular one is called ‘rule 30’ since the second row of figure 1, read as 00011110, is the binary expansion of the number 30.
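The lookup table can be expressed directly in code by reading the three cells as a 3-bit number and taking the corresponding bit of the rule number. This is a minimal sketch; the Python function name `S` mirrors the notation in the text, while the `rule` parameter and the helper names are my own.

```python
RULE = 30

def S(left: int, centre: int, right: int, rule: int = RULE) -> int:
    """New state of the centre cell: bit number (left centre right)_2 of `rule`."""
    pattern = (left << 2) | (centre << 1) | right
    return (rule >> pattern) & 1

# Reproduce the second row of the table, ordered from pattern 111 down to 000.
patterns = [(a, b, c) for a in (1, 0) for b in (1, 0) for c in (1, 0)]
row = "".join(str(S(a, b, c)) for a, b, c in patterns)
print(row, int(row, 2))  # prints "00011110 30"
```

For rule 30 specifically, the same table is given by the Boolean expression `left XOR (centre OR right)`, which can be checked against all eight patterns.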
To avoid dealing with infinite arrays, I will use a fixed window width of 200. That is, rather than an infinite array of cells, we just have 200 of them labelled as $s_i$ for $0 \le i < 200$. At the edges of the window, we allow the rule to wrap around, so that, on the left, $s_{-1}$ is defined to equal $s_{199}$ and, on the right, $s_{200}$ is defined to be $s_0$.
Then, if we start with the single cell number 50 set, and the others unset, repeated applications of the rule give figure 2 below. Here, the time starts from the top of the figure and increases towards the bottom. Each row of pixels represents the state at the corresponding time, with black pixels representing cells which are set and white those which are unset.
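The evolution just described can be sketched in a few lines, using modular indexing for the wrap-around at the window edges. The function names, the number of steps shown and the text rendering are illustrative choices of mine and not part of the construction.

```python
WIDTH = 200   # window width used in the text
START = 50    # index of the single cell which is initially set
RULE = 30

def step(cells: list[int]) -> list[int]:
    """Apply rule 30 once, wrapping around at the edges of the window."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        pattern = (left << 2) | (centre << 1) | right
        new_cells.append((RULE >> pattern) & 1)
    return new_cells

def run(steps: int, width: int = WIDTH, start: int = START) -> list[list[int]]:
    """Return the states at times 0, 1, ..., steps from a single set cell."""
    state = [0] * width
    state[start] = 1
    history = [state]
    for _ in range(steps):
        state = step(state)
        history.append(state)
    return history

# Print cells 30 to 70 for the first few steps, with '#' for set cells.
for state in run(20):
    print("".join("#" if c else "." for c in state[30:71]))
```

Each printed line corresponds to one row of pixels, with time increasing down the page. Because of the modular indexing, once the pattern reaches an edge of the window it continues on the opposite side.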
