# Building a STARK

This post will step through how to build a simple zk-STARK. Whereas I have outlined the ideas already in a couple of posts (STARKS I and II), these took a very general approach and consequently were rather abstract. This post will be more concrete by detailing how one can be built for a specific calculation. For this, I pick a ‘toy’ example for ease of exposition, but which is still very interesting. The calculation will be based on the rule 30 elementary cellular automaton. There are many ways of actually building a STARK, some of which will perform better than others. I make the choices aiming for simplicity and to avoid complex constructions.

First, let’s recap what a zk-STARK is. It is an acronym for zero knowledge Scalable Transparent Argument of Knowledge. While they are often phrased as proofs of existence of some quantity (e.g., existence of 3-colorings of a graph), more properly, they are a proof that the person constructing the STARK has knowledge of this quantity. We can be a bit more precise. The setup must include some algorithm or computable function F and a value b which it may or may not return. This information will be publicly known, so that anyone can verify what F(a) does evaluate to on given input a. Anyone can also make a statement such as:

I know of a value a such that F(a) evaluates to b.

However, such a statement does not itself prove that they do really know of such a value. An argument of knowledge is a method of proving the truth of this statement. As an example, consider digital signatures. In this scenario, the value a is a private key and F is the calculation deriving the associated public key b. A digital signature is an argument of knowledge of a private key. For elliptic curve based schemes as used in ECDSA and Schnorr signatures, it can be shown mathematically that for any public key there does exist an associated private key, although it can be practically impossible to actually find it. Hence, a digital signature is not a proof that a private key exists, which we know already. Instead, it is a proof that it was known to the person constructing the signature.

One method of proving a statement like the one given above is to simply display the value of a, so that anyone can evaluate F(a) and verify that it does indeed equal b. There are two reasons why this may not be a desired or viable approach, corresponding to the other words making up the acronym zk-STARK. Firstly, the value of a may be a secret, such as with the digital signature example. It should be zero knowledge. Ideally, the proof does not reveal any information beyond the fact that the prover knows of a valid value for a. Secondly, it should be scalable.

By scalable, we mean that the size of a zk-STARK should remain small and reasonably quick to validate, even if F is very computationally expensive to compute. For example, we could take the value a and hash it billions of times to get the result b. If we want to convince others of the result, then expecting them to go through the whole tedious calculation might not be realistic. Instead, a STARK can be constructed, which anyone can check proves that the result is b as claimed without having to redo the entire calculation. This is especially useful for blockchains where the use of STARKS can greatly reduce the work required by validators and avoid using up a lot of the limited block space. A long list of transactions with their associated digital signatures could be replaced by a much shorter proof of the combined effect of these transactions.

The scalability property is really the magic part of STARKs and the closely related SNARKs. A very long calculation will take a long time to perform. However, once one person has done this, they are able to prove the result to others without requiring them to also go through the long calculation. We really do not have any right to expect that this is even possible, but luckily it is!

Next, STARKs are transparent, which means that there is no trusted setup. This is in contrast to the closely related SNARKs (scalable non-interactive argument of knowledge), where a procedure is required to initialize the contruction of the SNARKs, and involves data which must be kept secret by those involved. Revealing this data would allow ‘proofs’ to be constructed even though the prover does not have the knowledge being claimed. STARKs do not have this issue.

Finally, I mention that STARKs are non-interactive. Any argument of knowledge involves two parties, the prover and the verifier, with the former trying to convince the latter of the truth of a statement. Interactive proofs involve messages being passed back-and-forth between the two. Essentially, the verifier asks questions of the prover, and checks his responses, until she is convinced of the proof of the original statement. A non-interactive proof just consists of a single message sent from the prover to the verifier. This is what is needed in many blockchain applications, since the proof can simply be constructed and added to the chain. Anyone can be the verifier by reading the data from the blockchain, such as done by validators.

As with any argument of knowledge, STARKs are sound, meaning that it is practically impossible for someone without the claimed knowledge to construct a valid proof. However, I should point out that, in theory, it is always possible to just guess at a valid argument by blind luck. For this reason, any such construction will come with a soundness parameter. This is an upper bound on the probability that, without the claimed knowledge, any parameter choices made in its construction leads to a valid proof by chance. The idea is that this should be tiny to avoid such false positives. It is true that an untrustworthy prover could try over and over, choosing different parameters each time, to try and brute-force a solution. As long as the soundness parameter is small enough — say, about 2-100 or lower — then it becomes practically impossible to even brute-force a solution.

#### Automaton Rule 30

The toy calculation used in this post is the elementary cellular automaton Rule 30. The idea is that we have a one dimensional array of cells, indexed by the integers. Each cell can be in one of two states, either set or unset, which I will label by the numbers 0 and 1. We then iteratively update this array, one step at a time. At every step of the calculation, each cell is updated according to the value of it and its immediate neighbours.

Using si(t) to represent the state of cell i at time t, I will also denote this by just si, suppressing the time variable for brevity. Its value si(t + 1) at time t + 1, which I denote by si, is some function of si - 1, si, si + 1.

 s′i = S(si - 1, si, si + 1).

The rule is defined by the function S and, given any initial state, this can be applied to compute the state at every future time. There are many different choices for S giving different cellular automata with different properties. I choose rule 30, for which the state si is determined from si - 1sisi + 1 by looking up its value in the following table:

As there are 8 possible values which si - 1sisi + 1 can take, and each of these can lead to two possible states for si, there are 28 = 256 distinct elementary cellular automata. This particular one is called ‘rule 30’ since the second row of figure 1 is the binary expansion of the number 30.

To avoid dealing with infinite arrays, I will use a fixed window width of 200. That is, rather than an infinite array of cells, we just have 200 of them labelled as si for 0 ≤ i < 200. At the edges of the window, we allow the rule to wrap around, so that, on the left, s-1 is defined to equal s199 and, on the right, s200 is defined to be s0.

Then, if we start with the single cell number 50 set, and the others unset, repeated applications of the rule give figure 2 below. Here, the time starts from the top of the figure and increases towards the bottom. Each row of pixels represents the state at the corresponding time, with black pixels representing cells which are set and white those which are unset.

# Mimblewimble

No, I haven’t gone crazy. Named after a Harry Potter spell rendering the target incapable of speaking coherently, mimblewimble is a blockchain protocol which scrambles up the transaction quantities and addresses, giving both a high degree of privacy and an efficient representation allowing massive savings on block space. According to its anonymous inventor, “I call my creation Mimblewimble because it is used to prevent the blockchain from talking about all user’s information”. Remarkable properties include not just hiding transaction quantities, but also allowing all transactions to be merged into a single big transaction, and spent outputs combined with their corresponding inputs and stripped from the blockchain. This last point has potential huge savings on blockchain space and validation time.

The protocol was implemented by the Grin and Beam cryptocurrencies, launched in January 2019. More recently, it was introduced into Litecoin in the MWEB upgrade (Mimblewimble extension block). This formally locked in on 3 May 2022 and went live a couple of weeks later.

# Non-Interactive Proofs

This post discusses non-interactive zero knowledge proofs, which can also be considered as digital signature algorithms. These can be used in blockchains to verify that the creator of a transaction has access to some secret data satisfying specific defined properties.

I previously discussed zero knowledge proofs, where a party Bob has some secret data satisfying certain properties. For example, this could be the private key associated with a publicly known Bitcoin address, or it could be a file whose SHA256 hash is equal to a given value, or maybe he knows a 3-colouring for a specified graph. In fact, it could be any data satisfying a clearly defined computable property. Bob wants to prove that he has such information to Alice but, as it is secret, he does not want to send her the data itself. Ideally, he wants to prove this without revealing any knowledge besides the simple fact that he knows of something satisfying the claimed properties. As was discussed, this can be achieved by an interactive proof where he exchanges messages with Alice until she is able to conclude with a high degree of confidence that he does indeed have the claimed secret. This as in figure 1 below, where Bob is acting as the prover and Alice is the verifier. As I will build on such zero-knowledge proofs here, if you are not already familiar with them it is suggested to first read the earlier post.

Alice verifies that Bob’s messages satisfy some required properties and, if they do, she accepts his proof. Otherwise, she rejects it. Recall the important properties:

• Completeness: If Bob has the claimed information, he is able to send messages to convince Alice of this fact.
• Soundness: If Bob does not have the claimed information, he is not able to fool Alice that he does, other than with a very small probability.
• Zero-knowledge: The verifier learns no information beyond the fact that Bob has the knowledge.

To ensure soundness, Alice should select her messages at random according to a specific procedure. Essentially, she is challenging Bob with unpredictable questions, which he is very unlikely to be able to answer all correctly unless he has the claimed knowledge. If Alice sticks to this method, she is an honest verifier. Similarly, if Bob has the claimed information then, to ensure completeness and zero-information, he should also construct his messages according to a specific procedure which, again, are randomized to ensure that he does not leak any secret information. If he does this, then he is an honest prover.

In many situations, the setup described above is not desirable or is not possible at all. It may be preferable for Bob to send a single proof to Alice for her to verify without having to respond with further questions. Maybe there is no communication channel from Alice to Bob, or maybe she will check the proof at a later date when she no longer has contact with Bob. This is the case, for example, where Bob is submits a transaction containing the proof to be included in a blockchain, and Alice is a network validator who later verifies this transaction. These are non-interactive proofs, so figure 1 should be updated to only include a single message sent from the prover to the verifier.

Technically, it is not possible for valid non-interactive proofs to be zero-knowledge, at least by the standards described in the previous post. For one thing, Alice could take the message sent from Bob, and pass it off as her own to try and prove that she has access to the secret data, which is incorrect. In fact, Bob may have already done this, and simply passed off a proof originally from someone else as his own. So, the best that we can expect, is for the message to prove that someone had access to the secret and constructed the message. Continue reading “Non-Interactive Proofs”

# Zero Knowledge Proofs

Alice is busy trying to find Waldo in the picture above (or, Wally, for those outside of North America). After some time without success, she is doubtful that he is even in the image at all. Bob assures her that Waldo is there, since he already found him. However, this is not enough to satisfy Alice. She needs proof before spending any more time on the task. How is it possible for Bob to prove this to her, without giving the game away by pointing out his location?

Assuming that the picture is printed out on a sheet of paper, Bob could take a large piece of cardboard and cut out a small Waldo-shaped hole. Then, while keeping the hole covered, he slides the paper underneath and, after correctly positioning it, he uncovers the hole to reveal Waldo! Alice can look with her own eyes and see his face, which looks like this: . She agrees that Waldo is in the image but, since all of the rest of the picture is covered by the cardboard, Bob has not revealed the location. So, she continues with her task in the knowledge that, at least, it is possible.

What Bob has done here, is provided a zero-knowledge proof that Waldo is in the picture. That is, he did this while providing no information to Alice other than the fact that he is there. You could argue that he did provide some information. Specifically, he showed Alice the exact size and appearance of Waldo’s face. However, if we assume that this information is no secret and is already publicly known, then it is true that Bob did manage to prove Waldo’s existence in the image without revealing any other previously unknown information. Continue reading “Zero Knowledge Proofs”

# STARKS II: Is that a polynomial?

The previous post outlined how a potentially long and complicated calculation can be represented by polynomials. Correctness of the calculation result can then be verified by checking that their values satisfy some simple identities. One issue that was avoided was how to identify if the values are indeed given by polynomials.

Bob (the prover) has performed a long and intensive calculation, and sent the result to Alice (the verifier). To be sure, Alice wants to verify the result, so requests that Bob also sends a proof of its correctness. The problem is that Alice does not have the time or computational resources to replicate the entire calculation, so it is necessary for Bob’s proof to be significantly shorter than this. So, Bob computes polynomial interpolants of the execution trace, and Alice requests their values of at a small number of random points. She checks that these values satisfy some simple algebraic identities which encode the calculation steps. If they are satisfied, then she agrees that Bob sent the correct result. However, this is assuming that Bob can be trusted. He could be sending any values selected to satisfy the identities. If we are not willing to simply trust that he has performed the original calculation correctly, why would we trust that he is really evaluating the polynomials? Alice needs a way of checking that the received values are consistent with polynomial evaluation. We will discuss how this can be done, and the solution to this problem forms a major component of STARK implementations for blockchain scaling.

# STARKS I: Polynomials, everywhere!

Terms such as zk-SNARKs, zk-STARKs and zk-Rollups are becoming increasingly common when discussing scalability solutions for blockchains. The idea is that, we do not send long and complex calculations to be computed on-chain, since it is expensive and inefficient, and there is limited capacity for such contracts. Instead, the calculation is performed off-chain, and only the result together with a proof that it is the result of the calculation is sent for inclusion in the blockchain. It turns out that it is possible to create a proof that a calculation leads to a particular result, such that the proof is much shorter than the calculation itself. Of course, the process of constructing these proofs is rather complicated, but that is done off-chain. In-depth descriptions of zk-SNARKS and zk-STARKS are usually rather long, and consist of several different and complex ideas all brought together. Here, I look at one of these ideas, which is replacing the original computation by expressions involving polynomials.

Suppose that Alice and Bob each have a data file of length N bits, and they want to check that they are identical. Of course, Bob could just transmit his information to Alice, and she can confirm if they are indeed the same. However, these are large files, and communication is slow. Instead, Bob could send only the bits from a randomly chosen set of places in his file. Alice can check that, at least, their data agrees in these locations. If a significant proportion of their data bits are different, then this has a high probability catching the discrepancy. However, it is possible that they differ at only one place, in which case Bob would need to send all or almost all of the data to have a good chance of catching this.

How can they do this is an efficient manner? How many bits does Bob have to send to Alice in order for her to have a high probability of telling whether or not their data agrees for every bit. At first thought, you may think that he needs to send all or almost all of it, so about N bits. On the other hand, if you are familiar with hash functions, then you would probably answer that he could compute something like the SHA256 hash and send this instead, so only 256 bits need to be transmitted. While that is true, there is also an efficient approach using polynomials, where we can prove its efficacy with some simple maths and without assuming any of the properties of hash functions. Also, as we will see, polynomials allow us to do so much more than just comparing bits. Continue reading “STARKS I: Polynomials, everywhere!”

# Schnorr Signatures

As part of the upcoming taproot upgrade, Bitcoin will support the Schnorr digital signature algorithm, in addition to the existing ECDSA implementation. In many ways, Schnorr signatures are superior to ECDSA. Not only are they simple to implement, they are also theoretically easier to understand, more amenable to mathematical proofs of security, have smaller signature strings, and are consistent with key and signature aggregation as well as batch verification. Probably, the main reason that Schnorr signatures were not used from the start, is likely due to the fact that it was under patent until 2008, so was not standard at the time of Bitcoin’s creation. Both the ECDSA and Schnorr implementations are based on the same secp256k1 elliptic curve. I give a description of Schnorr signatures, and how they are implemented by Bitcoin, in this post.

#### The Setup

The Schnorr signature method requires us to first agree on the following components, which define the implementation.

• A cyclic group E of prime order p. This is usually taken to be an elliptic curve over a finite field which, for Bitcoin, is taken to be the secp256k1 curve. This is the curve defined by the equation y2 = x3 + 7, with components (x,y) belonging to the field of integers modulo the 256 bit prime number
 q = 2256 – 232 – 29 – 28 – 27 – 26 – 24 – 1.

The group itself is not of size q. Rather, it is cyclic with order equal to another 256 bit prime number

 p=`0xfffffffffffffffffffffffffffffffebaaedce6af48a03bbfd25e8cd0364141`

To be precise, we should also specify exactly how elements of the group E are represented as binary strings, so that they can be transmitted across the internet and stored in Bitcoin transactions. I will describe this further below.

There is also a minor choice of notation to be made, although it does not impact the implementation at all. Groups can be written either multiplicatively or additively. In this post, I will use the additive notation. This is just a personal choice in writing out the formulas, but should be mentioned since some descriptions such as the Wikipedia page use multiplicative notation, meaning that the formulas look slightly different.

• A generator G for the group. As the group is of prime order, a generator is just any element other than the identity. Although it is arbitrary, for completeness I state the choice used by Bitcoin, which is (x,y) with,
 x=`0x79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798` y=`0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8`
• A cryptographic hash function h which, in the Bitcoin implementation, is SHA256.

With these choices made, we need to describe three things to have a complete digital signature algorithm. These are, the method of constructing a public/private key pair, the algorithm to verify the digital signature associated with a message, and an algorithm for the owner of the private key to construct such signatures.

#### Key Pairs

The first step in any digital signature algorithm is the selection of a public/private key pair. This is performed by the user who needs to be able to validate messages or transactions. They should choose a private key at random, and compute the public key from this. The public key is, as the name suggests, distribute publicly, while the private key must be kept secret. Anyone in possession of the private key would be able to sign messages, and spend the Bitcoin secured by the associated public key.

In the Schnorr algorithm, the private key is just an integer x modulo p. This is represented by a unique integer value in the range 0 ≤ x < p. For security, the private key should be chosen at random.

The associated public key is a point in the group E. It is computed by multiplying the base point G by the private key,

 P = x.G. (1)

As long as x is chosen at random, then P will be a completely random point of E.

Recall that the private key should be kept secret, since anyone in possession of this would be able to sign transactions. However, looking at equation (1), it appears that the private key can simply be backed out from the public one. By construction, it is the unique integer (mod p) which, when multiplied by G, gives the public key. In principle this can indeed be done, but the idea is that this is a hard problem for the choice of group E. This is the discrete logarithm problem (in our choice of additive group notation, it looks more like a discrete division problem). For elliptic curves such as the secp256k1 used by Bitcoin, there is no known way to solve for x other than a brute-force search which, given the huge size of the group, is computationally infeasible.

It is the hardness of the discrete logarithm problem that gives this method its security. Reconstructing private keys from public ones or, alternatively, producing valid signatures without knowledge of the private key, requires solving this problem which, for the Bitcoin elliptic curve, has no known feasible method of solution. More specifically, given an element P of the group, chosen uniformly at random, it should be impossible to solve (1) for x with anything other than a vanishingly small probability of success. Continue reading “Schnorr Signatures”

# The Group Structure of Elliptic Curves

In the previous discussion of elliptic curves we saw that, once we know some points on the curve, then these can be used to algebraically construct new curve points. Specifically, from two points, we take the line joining them and solve for the third point of intersection. This construction gives a binary operator. In mathematics, one of the most useful type of binary operators are those defining a group. There is a lot of mathematical theory developed for groups so, if we have a group operator, then it is important to realise this. Unfortunately, the operator defined on an elliptic curve as just described does not satisfy the requirements for a group. Specifically, it is not associative, meaning that if we combine together more than two elements then the result depends on the order of the operations. However, it turns out that, with a slight modification, the binary operator on an elliptic curve does define a group. Such groups are used in the digital signature algorithms employed by Bitcoin and other cryptocurrencies.

Recall that an elliptic curve is the solution set to a third order polynomial equation. For now let’s start with the relatively simple case where the equation is in Weierstrass normal form,

 y2 = x3 + ax + b. (1)

Here, the constant coefficients a and b lie in some base field F, as do the solutions for (x,y). In addition, there is one point at infinity 𝒪, which was required so that the binary operator is everywhere defined, and comes from the fact that we should consider algebraic curves as lying in projective space and, hence, can have points on the line at infinity. Also, it is assumed that the curve does not have singular points which, for the one given by (1), means that the cubic on the right hand side does not have a double root. This is equivalent to 4a3 + 27b2 being nonzero.

As it is symmetric in y, the solutions to (1) are symmetric in reflection about the x-axis. Specifically, for a point P = (x,y) on the curve, then (x,-y) is also on the curve. I will denote this reflected point by Pr. By convention, the point at infinity is equal to its own reflection, 𝒪r = 𝒪. As in the post on elliptic curves, I use PQ to denote the third point of intersection of the line joining P and Q with the curve. Combining this with a reflection about the x-axis does indeed give a group operation.

Theorem 1 Let E be an elliptic curve in Weierstrass normal form, and point at infinity 𝒪. Then, it is a commutative group under the binary operator

 P + Q ≡ (P○Q)r

Furthermore, the group identity is 𝒪 and the inverse of an element P is -P = Pr.

The group operator ‘+’ is as shown in figure 1 below. It is not at all obvious from looking at the plot that this operator satisfies associativity. In fact this is a rather tricky property to prove, and I go into more detail on how to prove this further below.

# Group Theory

Groups are a foundational algebraic concept used across many different branches of both pure and applied mathematics. In particular, the ECDSA and newer Schnorr digital signature algorithms used by Bitcoin are based on group theory. I will go over a small amount of the theory here, including the parts that are relevant for the signature algorithms.

Simple examples of groups include the real (or, rational, integer, complex, etc) numbers under the operation of addition, or the nonzero real (or rational, or complex) numbers under multiplication. Generally, a group is a set G together with a binary operator, which I will denote by ‘·’,

 G × G → G, (a, b) ↦ a · b.

For elements a,b of the group, the value a·b is referred to as their product or group product and, for brevity, it is also often written as ab. There are certain algebraic properties or axioms that are required in order for G to be a group. The first is associativity,

 (a · b) · c = a · (b · c), (1)

which must hold for all a,b,c in G. Continue reading “Group Theory”

# Projective Curves

Here, I will fill in some of the gaps (quite literally) from the earlier post on algebraic and elliptic curves. There, I showed how we can, algebraically, construct points on a curve in terms of already known ones. For curves of degree 2, or conics, this led to a parameterisation of the entire curve. For degree 3, or elliptic curves, this gave a binary operator allowing us to construct one new point out of any two existing points which, as I will cover later, leads to the group structure used in the Bitcoin digital signature algorithms. If you have not already, I would suggest reading that post before coming back to this one. However, there was one small problem that we encountered. For some special cases, the construction is not well-defined and directly applying the algebraic formulas would give divide-by-zero errors. We interpreted this as giving points at infinity. However, it was not clear what the points at infinity are, or how they should be handled algebraically. I will state upfront, this post consists of the mathematical background to better understand elliptic curves. If you are happy with the explanation given in the previous post, and to take at face value how to handle the point at infinity of the curve used by Bitcoin, then this post can be skipped. On the other hand, if you want to understand the mathematics better, then read on.

As an example, consider the hyperbola in figure 1 given by the solutions to ${x^2-y^2=1}$. Most lines that intersect the curve at all, will intersect it at precisely two points. There are tangents to the curve, that will intersect at one point, but with multiplicity two. However, lines which are perpendicular to one of the asymptotes have gradient 1 or -1 and will intersect at most once. To fill in this missing point, we interpret these lines as also intersecting the curve at a point at infinity. All parallel lines intersect at the same point at infinity and, in the case of the hyperbola, the asymptotes themselves (which have gradient 1 or -1 and pass through the origin) are interpreted as intersecting a point at infinity with multiplicity 2.