Any digital currency requires a method of representing quantities of the asset held by any parties, and transactions of these. The two main methods are the account model and the utxo (unspent transaction output) model, the latter of which is used by Bitcoin and which I will describe here.
I start with the idea of a transaction, which is simply a block of data containing a list of zero or more outputs and a list of zero or more inputs. Each of the outputs needs to contain a nonnegative number giving the associated quantity of the asset, and each of the inputs needs to contain a reference to the output of another transaction which it is spending. Note that it is not necessary for the inputs to contain the quantity, since this will be provided by the transaction output that it references. Each of the transaction outputs (txo) can only occur as an input of another transaction at most once. It must also be possible to arrange the transactions in order so that each input can only reference the output of an earlier transaction. By linking the inputs of each transaction with the referenced outputs, we obtain what is referred to as a directed acyclic graph (DAG).
Figure 1 gives an example with 6 transactions in total. The first two have no inputs but have outputs summing to 10, so both create 10 bitcoins out of nothing. The remaining transactions have the sum of their output quantities equal to the sum of their input quantities, so redistribute bitcoin but do not destroy or create any.
Once a transaction output has been referenced by the input of another transaction, it can no longer be used so is considered spent. Transaction outputs which are not spent are referred to as unspent, hence the acronym UTXO for unspent transaction output. The bitcoin blockchain can be considered as a state machine, where the state at any moment is just the set of its utxo’s. These utxo’s represent the total holdings of bitcoin by all participants at that moment in time. Any transaction to be added to this ledger must have its inputs reference distinct utxo’s from this set, which are then removed from the state and the transaction outputs added to the utxo set. With this model, a transaction must spend all or none of a utxo. It is not possible for a transaction spend half, or any other proper fraction, of a utxo.
Consider a particular transaction, with its inputs labelled from 1 to nin and outputs from 1 to nout. Use qiout to denote the quantity of output i and qiin for the quantity of the transaction output referenced by input i. Then, the transaction preserves the total quantity of coins if the sum of its inputs equals the sum of its outputs,
Most transactions should not be creating additional supply from thin air, in which case the sum of its output quantities must be no more than the sum of its inputs,
Bitcoin enforces this condition, with the exception of the very first transaction in each block which I will explain in a moment. In theory, floating point numbers could be used to represent the quantities of each output, but to avoid possible issues with rounding errors the use of integer arithmetic is usual. Then, 1 is the smallest quantity that is able to be transacted. To enable sufficiently small quantities to be spent by a transaction, this unit should be chosen to be a tiny fraction of the total supply. In the Bitcoin blockchain, a single unit is known as a satoshi, and one bitcoin is set at 100 million satoshis. This means that we can break a single bitcoin into 100 million parts, but can go no smaller than this. Since there is a maximum supply of 21 million bitcoin, this translates to 2.1 quadrillion satoshis.
Any transaction in which the sum of its output quantities is strictly less than the sum of its inputs is losing money, so you may wonder why we would ever want to do this. In fact, with Bitcoin, the difference of the sum of the inputs and the outputs is the transaction fee
According to the protocol, the miner of a bitcoin block is able to claim all of the fees of transactions in the block for himself. So, to encourage miners to actually include our transaction, we would need a nonzero fee, and a transaction with a larger fee is more likely to get included in a block.
Next, certain bitcoin transactions are allowed to have total outputs greater than the sum of its inputs, so generate new supply. According to the Bitcoin protocol, this is only allowed for the very first transaction in a block, known as the coinbase transaction. This is to provide the miner reward. The coinbase transaction has no inputs, and total output must be no greater than the sum of block reward (BR) and the total of the fees of the other transactions included in the block,
Technically, this is not quite true. The bitcoin protocol requires all transactions to have at least one input but, for the coinbase, it only has a single dummy input, which is effectively the same as no inputs. Note that allowing the coinbase transaction to generate the sum of the fees only produces new bitcoin in a quantity to compensate for that lost by the other transactions, so does not generate new supply. However, the block reward does create new supply. Initially, this block reward was set at 50 bitcoins, but halves every 210,000 blocks (approximately 4 years) in an event known as the halving or halvening. The most recent halving was in May 2020, at which the block reward was reduced to 6.25 bitcoins. Eventually, in another 100 years or so, the block reward will drop to zero and miners will only be rewarded by being able to collect fees.
Note that the UTXO model allows transactions to have multiple inputs and multiple outputs. This is for good reason.
Why must we allow transactions to have multiple outputs? To answer this, suppose that we own one bitcoin, given by a single utxo, but we want to pay someone 0.1 bitcoins. As a transaction must spend all or none of the utxo, this would be impossible with a single output. We could only spend the entire bitcoin at once or none at all. On the other hand, if we have two outputs, then one of these can have quantity 0.1 to pay the other party, and the remaining 0.9 bitcoins (or slightly less due to the fee) can be sent back to ourselves as change. This allows us to effectively break down our utxo into smaller quantities in order to be able to pay the correct amount.
Why must we allow transactions to have multiple inputs? Now, suppose that we own one bitcoin, but this is given by 10 separate utxo’s each containing 0.1 bitcoin, and we want to pay this bitcoin to another party. This would likely be the case if we had originally acquired our holding in 10 separate transactions of 0.1 bitcoin each. If a transaction could only have a single input, then the payment would require 10 separate transactions. This situation would just get worse and worse over time as the available bitcoin became broken down into ever smaller quantities. By having 10 inputs and a single output, we are able to make the payment in a single transaction.
Additional input and output data
There are still a couple of points to be addressed with the UTXO model as described so far. First, it was required that each transaction input has a reference to a transaction output. But, how is this reference expressed? This requires every transaction output to have a unique identifier which can be used in later transaction inputs to reference it. There are various ways in which this can be done, but Bitcoin works as follows. First, as the outputs of a transaction are listed in a specific order, its reference is given by an ID for the transaction itself along with the number of the output given by this output order. This still requires a unique identifier for the transaction though, which is called the transaction ID. Again there are several ways of defining a transaction ID but, for bitcoin, the ID is defined as a hash of the transaction itself. As cryptographic hash functions applied to different blocks of data never (or, is highly improbable) to give the same hash value, this leads to unique transaction IDs. Well, this is true so long as every transaction is described by a distinct block of data, which is almost always the case. In fact, it has happened that identical coinbase transactions have appeared on the blockchain, due to a miner paying the block reward to the same account in two separate blocks. This was a bug in bitcoin, and was fixed by simply counting any transaction with the same ID as a previous transaction as invalid, so it cannot be added to the blockchain.
Finally, we need to deal with the issue of ownership of each utxo. It cannot be the case that anyone can create a transaction spending any of the utxo’s, otherwise it would be useless. The solution is provided by the use of digital signatures, as explained in the previous post. Each transaction output can contain a public key. Every transaction input should then contain a digital signature, validating the transaction according to the public key contained in the output that it is spending. This means that a utxo can only be spent by the person in possession of the corresponding private key, as this is required to generate the signature, effectively giving them exclusive ownership of the utxo. More generally, transaction outputs can contain ‘locking’ scripts (the scriptPubKey) which specify the conditions for spending the output, and the inputs contain unlocking scripts (the scriptSig) which satisfy these conditions.