Building a STARK

This post will step through how to build a simple zk-STARK. Whereas I have outlined the ideas already in a couple of posts (STARKS I and II), these took a very general approach and consequently were rather abstract. This post will be more concrete by detailing how one can be built for a specific calculation. For this, I pick a ‘toy’ example for ease of exposition, but which is still very interesting. The calculation will be based on the rule 30 elementary cellular automaton. There are many ways of actually building a STARK, some of which will perform better than others. I make the choices aiming for simplicity and to avoid complex constructions.

First, let’s recap what a zk-STARK is. It is an acronym for zero knowledge Scalable Transparent Argument of Knowledge. While they are often phrased as proofs of existence of some quantity (e.g., existence of 3-colorings of a graph), more properly, they are a proof that the person constructing the STARK has knowledge of this quantity. We can be a bit more precise. The setup must include some algorithm or computable function F and a value b which it may or may not return. This information will be publicly known, so that anyone can verify what F(a) does evaluate to on given input a. Anyone can also make a statement such as:

I know of a value a such that F(a) evaluates to b.

However, such a statement does not itself prove that they do really know of such a value. An argument of knowledge is a method of proving the truth of this statement. As an example, consider digital signatures. In this scenario, the value a is a private key and F is the calculation deriving the associated public key b. A digital signature is an argument of knowledge of a private key. For elliptic curve based schemes as used in ECDSA and Schnorr signatures, it can be shown mathematically that for any public key there does exist an associated private key, although it can be practically impossible to actually find it. Hence, a digital signature is not a proof that a private key exists, which we know already. Instead, it is a proof that it was known to the person constructing the signature.

One method of proving a statement like the one given above is to simply display the value of a, so that anyone can evaluate F(a) and verify that it does indeed equal b. There are two reasons why this may not be a desired or viable approach, corresponding to the other words making up the acronym zk-STARK. Firstly, the value of a may be a secret, such as with the digital signature example. It should be zero knowledge. Ideally, the proof does not reveal any information beyond the fact that the prover knows of a valid value for a. Secondly, it should be scalable.

By scalable, we mean that the size of a zk-STARK should remain small and reasonably quick to validate, even if F is very computationally expensive to compute. For example, we could take the value a and hash it billions of times to get the result b. If we want to convince others of the result, then expecting them to go through the whole tedious calculation might not be realistic. Instead, a STARK can be constructed, which anyone can check proves that the result is b as claimed without having to redo the entire calculation. This is especially useful for blockchains where the use of STARKS can greatly reduce the work required by validators and avoid using up a lot of the limited block space. A long list of transactions with their associated digital signatures could be replaced by a much shorter proof of the combined effect of these transactions.

The scalability property is really the magic part of STARKs and the closely related SNARKs. A very long calculation will take a long time to perform. However, once one person has done this, they are able to prove the result to others without requiring them to also go through the long calculation. We really do not have any right to expect that this is even possible, but luckily it is!

Next, STARKs are transparent, which means that there is no trusted setup. This is in contrast to the closely related SNARKs (scalable non-interactive argument of knowledge), where a procedure is required to initialize the contruction of the SNARKs, and involves data which must be kept secret by those involved. Revealing this data would allow ‘proofs’ to be constructed even though the prover does not have the knowledge being claimed. STARKs do not have this issue.

Finally, I mention that STARKs are non-interactive. Any argument of knowledge involves two parties, the prover and the verifier, with the former trying to convince the latter of the truth of a statement. Interactive proofs involve messages being passed back-and-forth between the two. Essentially, the verifier asks questions of the prover, and checks his responses, until she is convinced of the proof of the original statement. A non-interactive proof just consists of a single message sent from the prover to the verifier. This is what is needed in many blockchain applications, since the proof can simply be constructed and added to the chain. Anyone can be the verifier by reading the data from the blockchain, such as done by validators.

As with any argument of knowledge, STARKs are sound, meaning that it is practically impossible for someone without the claimed knowledge to construct a valid proof. However, I should point out that, in theory, it is always possible to just guess at a valid argument by blind luck. For this reason, any such construction will come with a soundness parameter. This is an upper bound on the probability that, without the claimed knowledge, any parameter choices made in its construction leads to a valid proof by chance. The idea is that this should be tiny to avoid such false positives. It is true that an untrustworthy prover could try over and over, choosing different parameters each time, to try and brute-force a solution. As long as the soundness parameter is small enough — say, about 2-100 or lower — then it becomes practically impossible to even brute-force a solution.


Automaton Rule 30

The toy calculation used in this post is the elementary cellular automaton Rule 30. The idea is that we have a one dimensional array of cells, indexed by the integers. Each cell can be in one of two states, either set or unset, which I will label by the numbers 0 and 1. We then iteratively update this array, one step at a time. At every step of the calculation, each cell is updated according to the value of it and its immediate neighbours.

Using si(t) to represent the state of cell i at time t, I will also denote this by just si, suppressing the time variable for brevity. Its value si(t + 1) at time t + 1, which I denote by si, is some function of si - 1, si, si + 1.

si = S(si - 1, si, si + 1).

The rule is defined by the function S and, given any initial state, this can be applied to compute the state at every future time. There are many different choices for S giving different cellular automata with different properties. I choose rule 30, for which the state si is determined from si - 1sisi + 1 by looking up its value in the following table:

current pattern 111 110 101 100 011 010 001 000
new state for center cell 0 0 0 1 1 1 1 0

Figure 1: Rule 30.

As there are 8 possible values which si - 1sisi + 1 can take, and each of these can lead to two possible states for si, there are 28 = 256 distinct elementary cellular automata. This particular one is called ‘rule 30’ since the second row of figure 1 is the binary expansion of the number 30.

To avoid dealing with infinite arrays, I will use a fixed window width of 200. That is, rather than an infinite array of cells, we just have 200 of them labelled as si for 0 ≤ i < 200. At the edges of the window, we allow the rule to wrap around, so that, on the left, s-1 is defined to equal s199 and, on the right, s200 is defined to be s0.

Then, if we start with the single cell number 50 set, and the others unset, repeated applications of the rule give figure 2 below. Here, the time starts from the top of the figure and increases towards the bottom. Each row of pixels represents the state at the corresponding time, with black pixels representing cells which are set and white those which are unset.

rule 30
Figure 2: Rule 30 with a single initial cell set

Continue reading “Building a STARK”

STARKS II: Is that a polynomial?

The previous post outlined how a potentially long and complicated calculation can be represented by polynomials. Correctness of the calculation result can then be verified by checking that their values satisfy some simple identities. One issue that was avoided was how to identify if the values are indeed given by polynomials.

Bob (the prover) has performed a long and intensive calculation, and sent the result to Alice (the verifier). To be sure, Alice wants to verify the result, so requests that Bob also sends a proof of its correctness. The problem is that Alice does not have the time or computational resources to replicate the entire calculation, so it is necessary for Bob’s proof to be significantly shorter than this. So, Bob computes polynomial interpolants of the execution trace, and Alice requests their values of at a small number of random points. She checks that these values satisfy some simple algebraic identities which encode the calculation steps. If they are satisfied, then she agrees that Bob sent the correct result. However, this is assuming that Bob can be trusted. He could be sending any values selected to satisfy the identities. If we are not willing to simply trust that he has performed the original calculation correctly, why would we trust that he is really evaluating the polynomials? Alice needs a way of checking that the received values are consistent with polynomial evaluation. We will discuss how this can be done, and the solution to this problem forms a major component of STARK implementations for blockchain scaling.

poly values
Figure 1: Bob chooses a polynomial, which he evaluates at points chosen by Alice. How can Alice be sure that the received values are really from a polynomial?

Continue reading “STARKS II: Is that a polynomial?”

STARKS I: Polynomials, everywhere!

Terms such as zk-SNARKs, zk-STARKs and zk-Rollups are becoming increasingly common when discussing scalability solutions for blockchains. The idea is that, we do not send long and complex calculations to be computed on-chain, since it is expensive and inefficient, and there is limited capacity for such contracts. Instead, the calculation is performed off-chain, and only the result together with a proof that it is the result of the calculation is sent for inclusion in the blockchain. It turns out that it is possible to create a proof that a calculation leads to a particular result, such that the proof is much shorter than the calculation itself. Of course, the process of constructing these proofs is rather complicated, but that is done off-chain. In-depth descriptions of zk-SNARKS and zk-STARKS are usually rather long, and consist of several different and complex ideas all brought together. Here, I look at one of these ideas, which is replacing the original computation by expressions involving polynomials.

Alice and Bob

Suppose that Alice and Bob each have a data file of length N bits, and they want to check that they are identical. Of course, Bob could just transmit his information to Alice, and she can confirm if they are indeed the same. However, these are large files, and communication is slow. Instead, Bob could send only the bits from a randomly chosen set of places in his file. Alice can check that, at least, their data agrees in these locations. If a significant proportion of their data bits are different, then this has a high probability catching the discrepancy. However, it is possible that they differ at only one place, in which case Bob would need to send all or almost all of the data to have a good chance of catching this.

How can they do this is an efficient manner? How many bits does Bob have to send to Alice in order for her to have a high probability of telling whether or not their data agrees for every bit. At first thought, you may think that he needs to send all or almost all of it, so about N bits. On the other hand, if you are familiar with hash functions, then you would probably answer that he could compute something like the SHA256 hash and send this instead, so only 256 bits need to be transmitted. While that is true, there is also an efficient approach using polynomials, where we can prove its efficacy with some simple maths and without assuming any of the properties of hash functions. Also, as we will see, polynomials allow us to do so much more than just comparing bits. Continue reading “STARKS I: Polynomials, everywhere!”