Key and Signature Aggregation

A Bitcoin upgrade known as taproot was recently approved and is due to go live in November 2021. While this brings several new features, one is the introduction of Schnorr signatures to replace ECDSA that has been used by Bitcoin since it was created in 2009. These are both methods of creating and verifying digital signatures that, from the outside, are very similar and play the same role. However, Schnorr signatures do have several benefits, both theoretically and practically. The feature that I will discuss here is the ability to combine multiple public keys and signatures into a single one via a process referred to as MuSig. This will be a high level discussion, without going into technical details of the implementation.

Recall the general process of creating signatures that was discussed previously. First, Alice creates a public/private key pair. The private key (or secret key) sk is chosen randomly, so that it virtually impossible for anyone to guess its value, and Alice must protect this and ensure that it is kept secret. The public key PK is shared freely, so that it can be used by any third party as part of the validation process. Next, suppose that Alice has a message that she wants to authorise. In the case of Bitcoin, the ‘message’ is a transaction spending her coins. Alice uses her private key sk together with the message in the signing algorithm to produce the digital signature. This is then distributed publicly along with the original message. Finally, any third party in possession of the public key PK can apply the verify algorithm to this signed message to determine that it was indeed authorised by Alice. If the verification algorithm outputs True, then it is valid, otherwise if it outputs False, then it is invalid. This is as in figure 1 below, where the third party or validator called Val performs the verification step. The important feature of the whole process is that only someone in possession of Alice’s private key is able to create a valid signature.

sign and verify
Figure 1: Signing and verification of a single signature.

Next, consider multi-signatures. One situation is where Alice wants to generate multiple public and private keys for additional security. Now, authorisation requires a signature corresponding to each one of the public keys. Then, for an attacker to authorise messages on her behalf and spend her coins, they would need to gain access to all of the corresponding private keys. Alternatively, we could have joint custody of an account by, say, three people, Alice, Bob and Carol. They each choose their own public/private key pair and distribute the public keys, while each keeping their private key secret. Given a message, then authorisation requires all three of Alice, Bob and Carol sign. This is as in figure 2 below. The multi-signed message consists of the original message together with all of Alice, Bob and Carol’s signatures.

multi-sign
Figure 2: Alice Bob and Carol all sign the same message.

Continue reading “Key and Signature Aggregation”

Algebraic and Elliptic Curves

Bitcoin and other cryptocurrencies make use of cryptographic techniques employing elliptic curves, such as in the digital signature algorithms. Most of the time we just make use of these methods, treating them as ‘black box’ algorithms, and do not worry too much about what is going on under the hood. If you do decide to look up descriptions of them, including the Wikipedia entries for ECDSA and Schnorr signatures, or the entry in Bitcoin Wiki, then you will be confronted with some rather abstract group algebra. Then, you might learn that the group used by Bitcoin is the secp256k1 elliptic curve described by the equation y2 = x3 + 7 over the field of integers modulo some huge prime. Digging further, the group operation is given by some point multiplication formulas. By this time, if you did not really understand what was going on to start with, it can begin to look like some kind of black magic.

In this post, I will explain the underlying mathematical ideas. The hope is to demystify the algorithms but, to be clear at the outset, it will involve quite a bit of algebra. If you are not already proficient at such algebraic methods, then it will require some time to build up a good understanding.

Elliptic curves, as used by Bitcoin, are special examples of algebraic curves. These are curves in the plane described by the points (x,y) solving a polynomial equation. For example, a circle of radius r is described by the equation x2 + y2 = r2 and a parabola is described by y = x2. There is an endless list of formulas that we could pick, but a slightly more complicated example is

y2 + 3xy – x3 + 4x = 1. (1)

Figure 1 shows the set of points in the plane solving this equation. There are also online equation graphers that you can use to play about with different polynomial expressions.

curve1
Figure 1: The curve y2 + 3xy - x3 + 4x = 1.

Continue reading “Algebraic and Elliptic Curves”

Merkle Trees

driving

As I described previously, cryptographic hash functions have the ability to compress an arbitrarily large block of data into a fixed size. The result, or hash, then acts as a unique identifier for the original data. Merkle trees do all this and more. The idea is that, as with a hash function, a large block of data can be represented by a single fixed-size identifier, known as the Merkle root or hash root. This much is exactly the same as with a regular hash function. However, if the data is broken down into a collection of individual parts, then it is possible to verify each part individually. This is different from regular hash functions, where the entire block of data is required in order to recompute the hash and verify the data correctness. Some example uses of Merkle trees are:

  • Transmission of large collections of files or databases. Instead of downloading a large database over the internet, we can first download its Merkle root. Then, the individual entries of the database can be downloaded as needed, possibly from different sources, and separately verified to be correct.
  • Efficient storage of large amounts of user permissions. Only the Merkle root needs to be stored then, when a user wants to verify his or her permissions, this can be done individually rather than requiring access to the entire database.
  • Bitcoin stores the Merkle root of the transaction data in the block header. This can be used by thin clients to verify that individual transactions are included in a block without having to download and store the entire blockchain. Only the headers are required to be stored.

Suppose that we have an array of N data blocks. Then, the procedure is as follows:

  • The Merkle root of the array of data is computed, which is a binary string of fixed length (256 bits is sufficient).
  • For any data block in the array, we can transmit this along with a short piece of information known as the Merkle proof, to any other party in possession of the Merkle root. Using this, they are able to verify that the data block is correct and was indeed in the original data.
  • The other party does not need to trust the person transmitting the data. They only need to be confident that they have the correct Merkle root. Then, the proof is sufficient for the verification, since it is impossible to create a valid proof for any data which was not in the original array.
  • The size of the Merkle proof is of the order of the logarithm of N. This is bounded for practical purposes. For example, it is very unlikely that we would ever have arrays longer than 2100 so, effectively, the base-2 logarithm of N is bounded by 100.

The magic here is that the other party only needs to have the 256 bit Merkle root stored locally. Then, to prove that a block of data is in the array, we only have to transmit this block along with a short proof, regardless of the size of the original array of data.


Constructing the Merkle Tree

I now go into the mechanics of a Merkle tree. The first thing that we need is a cryptographic hash function, which maps any block of data to an essentially unique binary string of fixed size. For example, the SHA256 hash function can be used, which creates hashes of 256 bits (Bitcoin actually uses a double, or repeated, SHA256 hash). Throughout this post, I assume a hash function has been chosen, which will be denoted by H.

We build a Merkle tree from a sequence of blocks of data as follows. First, take the hash of each item. Then, arrange them as the leaves of a binary tree, where each internal node is assigned a value equal to the hash of its two children concatenated together. For a sequence of 16 data items, labelled t0 through t15, this is as in figure 1. Here, I have labelled each node by its hash value hb for a binary string b indicating the position of the node in the tree. So, for example, we have h1001 = H(t9) and h100 = H(h1000h1001). These hash values can all be computed by starting at the leaf nodes and, iteratively, working up the tree. Finally, the Merkle root is the hash value h at the very top.

Merkle tree
Figure 1: A Merkle tree with 16 leaves.

Continue reading “Merkle Trees”

Hash Functions

sha

Cryptographic hash functions are amazing things, with almost magical properties. They have the ability to compress an arbitrarily large block of data into a fixed size, called its hash. For example, the SHA256 hash function will compress any amount of data into a unique string of 256 bits (32 bytes).

Mathematically, this is impossible. Consider how many strings of 256 bits exist. This is a straightforward question, and the answer is 2256. Then, how many strings of arbitrary length exist? Clearly, there are infinitely many. We cannot map an infinite set into a finite one while ensuring that distinct elements always have distinct images. More generally, we cannot map a large set into a smaller one in a one-to-one fashion. So, even restricting to inputs of length 257, there must exist pairs of distinct inputs that map to the same output hash.

What gives? Well, we can only ever evaluate the hash function a finite number of times. Adding up the entire number of times that anyone has evaluated an SHA256 hash ever in the world, we get a very large number, but it is still a lot less than 2256. All that we need is for the hash function to effectively be one-to-one, so that with all the time and computing power in the world, the probability of ever finding two distinct inputs with the same hash is vanishingly small.

It is for this reason that cryptographically secure hash functions have strong pseudo-random properties. Although the hash of any input string can be computed by a deterministic algorithm, the result obtained depends seemingly randomly on the input. It is tricky to make this idea mathematically precise, but one consequence is that the probability of finding two distinct strings with the same hash is about the same as randomly generating strings of 256 bits, and finding two that are equal. The space of strings of 256 bits is of size 2256 which, although finite, is huge. The probability of two totally randomly generated 256 bit numbers ever being the same is so small that it is effectively impossible. The situation is similar to that of digital signatures for a public key cryptosystem, where it is theoretically possible to find a valid digital signature without knowing the private key but, in practice, it is impossible.

For strong hash functions, there is no known way to go in the reverse direction, so that we cannot reconstruct a data string from its hash. In some ways this is a drawback since we cannot simply compress a file into 256 bits by a hash function and then decompress it later. In other ways, though, it is a big benefit of hash functions, as they allow private data to be represented in a secure way.

Cryptographic hash functions have the following two properties.

  • Preimage resistance. Given a hash, it is computationally infeasible to find an input string which hashes to that value.
  • Collision resistance. It is computationally infeasible to find any two distinct strings with the same hash value.

The first of these two properties is actually a consequence of the second. If we could find preimages of hashes, then we could also find two strings with the same hash. To just do this, randomly generate input strings of sufficient length and compute their hash. Then, find a preimage which, with high probability, will be distinct from the original string, giving a hash collision.

A few applications of hash functions are:

  • Error free transmission of large files. If we want to send a large file over the internet, then we can first send its hash. Then transmit the large file and, by computing the hash of the received file, we can detect if errors have occurred.
  • File storage. If we upload files to a file storage system, then we would like to know if any of our locally stored copies have any changes. Rather than having to download the entire filesystem to compare, we can just download the file hashes.
  • Storage of secret data. Secret data such as passwords can be stored in a secure way by just saving their hashes. It can then be verified that a password is correct by re-computing the hash, but the password itself never needs to be stored.
  • Proof of work methods used to secure blockchains rely on hash functions, which I will explain in another post.
  • Hash functions can be used to create an ID for referencing a block of data. For example, in Bitcoin, transaction IDs are created as the hash of a transaction, and every block is linked to the previous block by containing its hash.
  • Merkle trees allow arbitrarily large collections of data to be stored as a single 256 bit hash, and it is possible to verify that individual blocks of data are indeed contained in the collection without ever having to handle the whole database.

Continue reading “Hash Functions”

Digital Signatures

signatures

As discussed in a previous post, one important factor of any digital currency is the ability of a holder of the asset to prove ownership. Specifically, if they want to make a payment by transferring the asset to another party, they need to submit a transaction to the network and be able to prove that this transaction has indeed been authorised by them. It should not be possible for a third party to be able to create any such authorised transactions involving the same holding, otherwise anyone would easily be able to steal the money. This is achieved with digital signatures implemented by public-key cryptography.

To explain this, let us start with a prospective holder of the cryptocurrency, called Alice. The first thing that she needs to do is to create key pair. This consists of a private (or, secret) key sk and a public key PK. The process of creating these keys depends on the signature algorithm being employed, but for the elliptic curve crytography (ECC) used by Bitcoin, for example, the private key is a random 256 bit number, and the public key is another number (of possibly varying length) derived from this.

As the name suggests, Alice needs to keep her private key sk secret. If anyone was to obtain it then they would be able to make transactions on her behalf, and steal her money. On the other hand, if she was to lose it, then she would no longer be able to sign transactions and would effectively lose her holdings of the digital asset. It is also for the reasons of security that the initial creation of the private key is random otherwise, if people know how it is constructed, then they would be able to recreate her key. It turns out that guessing a random 256 bit number is an effectively impossible task. It is difficult to emphasise quite how hard this is, even making quadrillions of guesses every nanosecond over the entire current age of the universe would give a vanishingly small probability of ever guessing correctly. For all realistic purposes, it is impossible. On the other hand, Alice’s public key PK can be shared freely.

Once this is done, Alice is able to digitally sign any message to prove that it has been authorised by her. In this context, the message is simply a block of digital data (a sequence of bits). Alice uses the message and the private key in the signature algorithm to generate the signature, which is another sequence of bits. For the ECDSA (Elliptic Curve Digital Signature Algorithm) used by Bitcoin, the signature is another string of bits, of variable length. The original message together with the signature can be shared publicly. Any third party in possession of Alice’s public key can then take the signed message and, by a verification algorithm, confirm that it has indeed been authorised by Alice. At least, he can verify that it has been signed by someone in possession of Alice’s private key. This is demonstrated in figure 1, where the third party ‘Bob’ wants to check the validity of the signed message.

signature process
Figure 1: Digital signature and verification process

Continue reading “Digital Signatures”