Leaf-Node weakness in Bitcoin Merkle Tree Design

This document describes a weakness in Bitcoin Design that reduces the security of SPV proofs and therefore SPV Wallets.  The weakness was discovered by me on August 2017, but during the responsable disclosure process I learnt it was previously known by some prominent members of the Bitcoin Core team. Using this weakness an attacker can create a valid SPV proof for a fake payment to a victim that is using a SPV wallet, the payment amount being an arbitrary number of bitcoins, and trick the victim into accepting this payment as valid.  Happily, exploiting this bug requires brute-forcing between 69 and 73 bits (depending on initial investment), each operation being a double SHA2, and there are very simple probabilistic protections that SPV wallets can implement easily. For example, an attack can be carried on with an investment of 3M USD (*). It is assumed that most SPV wallets will be vulnerable to this attack. Also vulnerable are automated systems that accept SPV proofs (such as the Blockstream Elements sidechain in non-federated mode and the RSK Bridge contract). A simple patch which it neither a soft nor a hard fork can prevent the attack described. Also, a second attack that forks the Bitcoin blockchain is presented, requiring the brute-forcing of 225 bits, so it’s only of theoretical interest.

The Problem

Bitcoin Merkle tree makes no distinction between inner nodes and leaf nodes. The depth of the tree is implicitly given by the number of transactions. Inner nodes have no specific format, and are 64 bytes in length. Therefore, an attacker can submit to the blockchain a transaction that has exactly 64 bytes in length and then force a victim system to re-interpret this transaction as an inner node of the tree. An attacker can therefore provide an SPV proof (a Merkle branch) that adds an additional leaf node extending the dual transaction/node and provide proof of payment of any transaction he wishes.

It must be noted that a problem reciprocal to this was identified by Andrew Miller [1]. He realized that internal nodes could be re-interpreted as transactions, but it seems he didn’t publish (or didn’t discover) the opposite attack: transactions re-interpreted as nodes. Note that for the coinbase transaction this weakness is not present: getting a coinbase transaction to be interpreted as a hash of two transactions requires brute-forcing at least 216 bits (so it’s unfeasible, unlike the weakness presented in this post).

Crafting a Transaction-node in 2^72 operations

The following diagram shows a 64-byte Bitcoin transaction, and how this transaction is split into two 32-byte chunks.

Absolute Offs 32-Byte chunk 32-byte offs Size Description Brute force bits (stage)
0 1 0 4 Version 0
4 1 4 1 Input count 0
5 1 5 27 Input 0 Tx hash (part 1) 0
32 2 0 5 Input 0 Tx hash (part 2) 40 (2)
37 2 5 4 Input 0 Txout index 17 of 32 (1)
41 2 9 1 Input 0 script length 8 (1)
42 2 10 0 Empty Script 0
42 2 10 4 nSequence 0 (anything allowed)
46 2 14 1 Output count 8 (1)
47 2 15 8 value 29 of 64 (1)
55 2 23 1 Output 0 script length 8 (1)
56 2 24 4 Output 0 scriptPubKey 0 (anything allowed)
60 2 28 4 lock_time 2 of 32  (1)
Total (1) = 72

Total (2) = 40

Let the transaction to become an inner node be T.  Let R be the new fake transaction that the attacker wants to build as T’s right child. As the victim is an SPV node, then the SPV proof will require the left 32 bytes of T as the complementary hash as part of the Merkle proof. Luckily, the attacker is free to provide any 32-bytes to the victim.
From now on, we assume that the attacker is a Bitcoin miner, and he owns 2^36-1 satoshis (= 687 BTC or about 5M USD as of June-2018) in a single UTXO A. The attack requires two stages, executing a different brute-force procedure in each.

Stage 1 Brute-forcing procedure

The attacker focuses on the second half of T. He wishes to build a transaction whose double-hash matches this second half. To be a valid Bitcoin transaction, some fields of the second half of T must have specific values. Those parts must be “brute-forced”, meaning that the attacker would need to try many transactions R until he finds one matching the desired patterns in T. Some other parts of T are free for the attacker to choose. Still some other parts of T are partially free, and they will be restricted according to other parts of T. Let’s analyze which parts of T need brute-forcing.

We will temporarily forget about the 5 bytes of the input 0 transaction hash that overlap the second half of T and we’ll let anything fill this part. Later, in a second brute-forcing stage, we’ll find a transaction that matches this tail.

The nSequence field can contain anything.

The LockTime field is partially brute-forced: the attacker makes sure the brute-forced uint32 is between 500000000 and 1501843940 (today as a Unix timestamp) or between 0 and 479042 (the current block height). This implies that the LockTime has elapsed. The combined numerical range amplitude is 1002322982, which is close to 230. Therefore, he only needs to brute-force 2 bits of the LockTime.

The higher bits of the Previn transaction output index must be 0 so the index is always less than 32768. Since the field is 32-bits in length, the 17 most significant bits must be brute-forced.

The transaction output amount  is only partially brute-forced. From the 64 field bits, 35 bits can be chosen freely because the attacker has 2^36-1 satoshis, so 35 free bits can represent any number lower than the amount.  The remaining  29 most significant bits must be zero and should be brute-forced. If the attacker has more BTC, he can reduce the number of bits to brute-force by consuming more BTC in A.

Note that the amount paid is not lost: the transaction T pays to a anyone-can-spend address but  the miner can immediately consume that output and move the funds to a safe address he owns. Also he will be collecting any transaction fees offered by the transaction T as T doesn’t have enough space to include a change output. The attacker can recover all the funds but creating the anyone-can-spend output and a high-fee transaction T will put the attacker in the risk of being himself victim of a fee-sniping attack: the possibility that another miner reverts the blockchain to mine a competing block including the transaction T to collect both fee or output amount in T. The attacker may need to mine several blocks in private (selfish-mining) to prevent the fee-sniping attack.

The input scriptSig can be a script of length up to 4 bytes, but it must contain a valid opcodes so that it can verify the scriptSig of the input (as. T is a valid Bitcoin transaction). The attacker will be able to shape this scriptSig as he wishes to make both scripts correct. Probably the best option reduce brute-forcing needs is to make it 4 bytes in length, and contain any mix of valid opcodes. Since the stack can contain elements inserted by the associated scriptPub, most opcodes will be valid. We will assume that one every 8  random list of 4 of bytes will result in valid script, so only 3 bits of brute-forcing will be needed.

The output scriptPub can be a script of length up to 4 bytes, but it must contain a valid anyone-can-spend script. Probably the best option to reduce brute-forcing is to leave this script empty, as a non-empty scriptSig would need to start with a small subset of push opcodes, because the script stack is empty when the scriptSig is executed.

Creating the Fake Transaction R

The attacker will try to create a fake transaction R whose transaction ID matches the right half of T and use this fake transaction to cheat a victim. An easy way to do it is to create a fake transaction having one output for the fake payment to the victim, and scan all possible values for the lock_time field until the transaction ID matches the second half of T. When the lock_time field space is exhausted, the input script can be slightly modified and a new 32-bit space of lock_time values is created. Again, an AsicBoost-like technique can be used to save one message-scheduler of the double-SHA2 operation.

The total number of bits to brute-force is therefore 72 bits. However, since the scriptSig and scriptPub

Stage 2 brute-forcing procedure

Now we must solve the problem of the 5 random bytes of the input transaction hash that were chosen by the first state.  We note that the first stage also produces a random transaction output index for this input. Let Q be the transaction output index randomly chosen in the first stage. The attacker creates a huge number of transactions W(i) consuming the input A and having Q outputs each. He aims to find W(j) whose ID tail matches the 5 bytes tail mentioned before. The last output of W(j) will be consumed by T.

The procedure to find W(j) is similar to the one used in stage 1. The attacker makes small changes to the tail of the transactions W(i) and re-hashes to obtain the ID, therefore creating new transaction IDs requires only two SHA-2 compressions (one to finalize the transaction hash and the double-hash). The lock_field can be used to iterate fast. The attacker tries to find a transaction whose transaction ID end in the 5-bytes tail chosen in the first stage. Let’s call this transaction W(j) simply P. This transaction consumes an output controlled by the attacker having A funds. Since 5 bytes must be matches, approximately 2^40 operations will need to be performed to find P.  An AsicBoost-like technique can be used to save one message-scheduler of the double-SHA2 operation.

The total number of bits to brute-force in the second stage is therefore 40 bits.

Interestingly, the brute-forced half of the transaction created in the first stage can be reused as many times if the second stage is re-done. The attacks can be carried out at different times, in different blocks. Therefore, after a successful first attack the following attacks require only brute-forcing 40 bit each. If the number of outputs in P is 33K, the amortized cost of the attack corresponds approximately to brute-forcing only 65 bits. A final transaction E is added in the same block to consume the output of T and recover the funds at risk.

c1.PNG

Fig 1. Chain of transactions in attacker’s block

C2.PNG

Fig 2. Merkle Tree of the Block containing the fake transaction R

The Cost of the Attack

The technology required to build a custom ASIC that performs the brute-forcing of the second stage is very similar to the technology used for Bitcoin miners.

A state-of-the-art Bitcoin miner reaches 14 TH/s and costs $1300 USD. An attacker that buys 1000 units, investing 1.3M USD in hardware, can scan a 72-bit space in 4 days. An additional 1M-10M USD for ASIC chip design and tape-out may be required, depending on the node density. If a chip that can accomplish a programmed pattern match search (instead of zero prefix search) already exists, the design/tapeout costs are saved.

Mining several blocks in private to confirm the fake transaction may be needed to prevent the other miners from reorganizing the blockchain in order to steal the fees and output of transaction T. If the attacker is not in collusion with 51% of the miners, then this may cost the attacker millions in rented hashing power, if available. If the attacker is colluding with 51% of the miners there is no additional cost.

Therefore, under favorable conditions to the attacker, the attack can be profitable if the attacker can cheat one or more users for 1.3M USD or more in total. As any rational actor receiving large amount of BTC will double-check the reception using a full node, only autonomous systems relying on SPV proofs (such as the Elements blockchain and the RSK bridge) may suffer from these attacks.

The attack presented in this document is not optimal: there are several trade-offs than can be made to reduce the number of bits to brute-force increasing other variables, such as initial investment. Locking twice the money in A allows the attacker to reduce the cost in mining hardware to a half, etc.

An (expensive) attack to partition Bitcoin

The same vulnerability allows an attacker to fork the Bitcoin blockchain in two without reconciliation, however this attack costs 2^240 double SHA2 operations, so the attack is impractical. The attack requires the attacker to mine a block A with only one specially crafted coinbase transaction of 64 bytes, and create a competing block B with 2 transactions where the pair of transactions IDs in B hashes to the coinbase transaction in A. Both blocks are broadcast simultaneously to different parts of the Bitcoin network. Therefore approximately 50% of the network receives A and the other 50%, B. The two sets of transactions in A and B are created so that they consume different UTXOs and/or create different UTXOs, and therefore both blockchains are valid, but incompatible.

The fact that the first transaction is the coinbase transaction, and a coinbase transaction requires 27 zero bytes and 1 fixed byte in the first 32 bytes, makes the attack practically unfeasible, as it requires brute-forcing 225 bits.

Remedy

There is no need to hard-fork or soft-fork Bitcoin, although a future programmed soft-fork or hard-fork could fix this vulnerability.  We propose two non-forking solutions and three forking solutions.

One easy non-forking remedy is that SPV wallets check that every internal 64-bit node of the SPV proof is not a valid transaction. First, there are no 64-byte Bitcoin transaction that pass standard-checks, so the presence of such transaction should rise an alarm. Even if such transactions were normal, the probability of a random 64-byte chunk to become a syntactically valid Bitcoin transaction is close to 2^-56. Therefore, the SPV client can flag the presence of such dual transaction-node as an attack, and refuse to accept the SPV proof.

Another way to fix SPV wallets is to require, along with a Merkle-proof of the inclusion of a transaction E, a Merkle proof for the coinbase transaction. Because building a dual transaction-node for the coinbase transaction requires brute-forcing 225 bits, showing a valid coinbase and its corresponding Merkle inclusion proof is enough to discover the tree height. Both the E branch and the coinbase branch should have equal tree depths.

This method has a drawback: the coinbase transaction may be large, so the method requires non-logarithmic space. If the coinbase transaction is larger than 64 bytes, we could show as evidence only the last a SHA2 mid-state of the first SHA2 application to the coinbase transaction and the transaction length in bytes: with these two field the coinbase id can be reconstructed, because SHA2 embeds the bit-length in the last message block. This allows the verified to effectively differentiate between a 64-byte node and the coinbase transaction. The security of this constructions is weaker and relies on the infeasibility to find free-start SHA2 pre-images. To complicate things, it is still possible (for the next 145 years) that a malicious miner includes a coinbase having exactly 64 bytes.  Therefore, the SPV client must also perform a special check in case the coinbase is exactly 64 bytes in length: in this case it must verify that the previn field contains 256 zero bits (of which 216 are stored in the left half). It’s currently infeasible to obtain a pre-image of a hash digest having 216 zero trailing bits.

Yet another solution that keeps the logarithmic-space constrain for Merkle-proofs is to present an inclusion proof for the right-most transaction ID. In case the number of transactions is not a power of 2 (a full tree), this id appears duplicated at the last two nodes of the tree, so that the tree height can be inferred. A complete solution would require soft-forking Bitcoin to prevent blocks having a full tree, which is nasty.

To summarize, the most space-efficient solution without a soft-fork would be hybrid: show inclusion proof of the rightmost tx id branch if the tree is not full, else the coinbase (showing mid-state and length) if length is greater than 64 bytes, or else show the full coinbase transaction.

A simple soft-forking solution is to invalidate blocks having transactions of size equal to 64.

Another soft-forking solution is to require that a new field “depth” requiring 4 bits is embedded in the block version field. This new fields should specify the Merkle tree depth minus one. By using 4 bits, a a tree of depth 16, containing up to 65K transactions can be specified. Full nodes much check that the actual tree depth matches the value of “depth” given. SPV nodes can check that the “depth” field matches the Merkle branch length received.

A hard-forking solution is adding a prefix to internal nodes of the Merkle tree before performing node hashing and adding a different prefix to transaction IDs and then also hash them. Ripple ([2]) uses a such a prefix system.

Thanks

Special thanks to Juliano Rizzo and Matias Marquez from RSK Labs R&D for helping me reduce the attack complexity. Also to Gregory Maxwell for discussing the best way to correct this problem.

(*) The amounts in USD described in this document were computed at when the weakness was discovered and now they may differ significantly.

[1] https://cs.umd.edu/~amiller/BTCRelayAudit.pdf

[2] https://wiki.ripple.com/Hash_Tree

History

Finding Date: August 4th, 2017 (identified as CVE-2017-12842).

Bitcoin Core Report Contact Date:  August 8th, 2017 (from the interactions we learnt that is was already known to some Core developers)

Publication Date: June 9th, 2018 (triggered from the exposure of the weakness in the Bitcoin mailing list)

June 13th, 2018: Luke Dashjr contacted me and told me he already knew about this weakness. I trust Luke’s word that this is the case, and as I said in the first paragraph, the weakness was already known by some developers. But I still don’t understand (1) why so many people knew about it  but underestimated it badly, (2) why there was no attempt to fix it.

May 12, 2023: The blog post was edited to fix many typos and make it clearer, without changing the ideas or procedures presented.

Author: Sergio Demian Lerner, RSK Labs

  1. Leave a comment

Leave a comment