Blockchain State Storage Rent Revised

(This post is an updated re-post of a previous post on the RSK blog.)

In a nutshell, storage rent is a fee users pay in order to have their accounts, contracts and memory live on the network at any time, so their data can be accessed fast and at a low cost. Storage rent does not fulfill any purpose in the short term, but it is required to ensure the long-term viability of a cryptocurrency platform. However, the design and implementation of efficient storage rent is very tricky. In many cases the amount of memory a user persists in a contract is so small that the rent becomes a micro-transaction, and the cost of processing the rent payment is higher than the amount paid. A seemingly reasonable implementation of storage rent may fail to consider this cost imbalance. One also has to consider the CPU and space costs of storage accounting and the cost of managing misbehaving contracts. These management tasks can easily become bottlenecks to scalability. At RSK we considered several designs, with their pros and cons, until we settled on the current approach. You can see the different RSKIPs representing the distinct approaches in our GitHub repository. We must note that the Ethereum community also discussed adding storage rent to their platform, but the proposal was dismissed.

The Cost of Full Nodes

As the blockchain state at the best block grows, the cost of maintaining a full node is dominated by the cost of fast access to the blockchain state, not by the cost of accessing historic blockchain data. Accessing the blockchain state must be fast to increase the throughput and security of the network (by shortening the block propagation critical path), while accessing historic parts of the blockchain doesn’t have any hard efficiency constraint. In other words, if verifying a new block takes too much time, miners are incentivized to mine empty blocks or empty uncles. However, there is no need for fast access to historic blockchain data, as it can be downloaded slowly in the background from different peers even while the full node verifies the latest blocks, provided trusted checkpoints are used temporarily. Therefore the most valuable resource any cryptocurrency tries to protect from bloat is the blockchain state, and not the blockchain blocks. The blockchain state in RSK grows when new accounts or contracts are created, and when contracts request additional contract storage.

Who Should Pay for Blockchain State Storage

One of the unsolved problems of Ethereum is that state storage can be acquired at a low cost, or even zero cost, and never released, forcing all full nodes to store that state information forever. Storage becomes free when there is no transaction backlog, because spare block gas is available. There are almost no examples in real-world commerce where users acquire, with a single non-recurring payment, eternal rights over a property that requires continued maintenance by third parties. But that is the case of blockchain state storage in Ethereum and, to a lesser extent, UTXO storage in Bitcoin. Maintaining the state space requires paying for electricity and the amortization cost of storage hardware, and the cost must be multiplied by the number of full nodes in existence. It can be argued that full nodes are altruistic, and therefore willing to incur any state storage cost the cryptocurrency users demand. While this may have been partially true for Bitcoin nodes in the past, altruistic behavior has declined sharply as the blockchain size has grown. The number of Bitcoin nodes is increasing slowly but the number of Bitcoin users has increased considerably more, so it’s unclear whether node count will follow Bitcoin demand. Block pruning and sharding techniques are expected to let new users commit a smaller amount of storage to historic blockchain data, yet the state must still be maintained in full for block verification. Requesting all the state information required to verify a block (the read or written branches of the current state trie) is generally not possible in real time when propagating a new block, as it incurs huge bandwidth consumption. RSKIP58 proposes a partial solution, where nodes send the modified state records but not the records read by smart contracts during execution. This allows a form of header-only block propagation.

If the use of state storage is not protected from abuse, we risk pricing out full nodes. Controlling the state size reduces the centralization pressure while maintaining a free market. Considering the long-term risks of state bloat and the uncertainty of Moore’s law and similar trends in the future, it is clear that users should pay a state storage rent as a preventive measure. Central economic decisions like this one cannot be applied later without breaking the community contract.

Who Should Rent be Paid To?

At first glance, as full nodes store the blockchain and the state, it seems that storage rent should be paid to full nodes. However, HDD storage costs keep decreasing at a rate of 40% per year (this trend is known as Kryder's Law), so under this trend the real cost of storing a fixed amount of data forever is bounded. A similar trend exists for SSD storage. The electricity cost of persistent storage tends to zero if the storage is not accessed, but it increases with the number of read or write accesses per second.
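The bound can be made concrete with a small sketch. Assuming the yearly cost of storing a fixed amount of data falls by 40% every year, the yearly costs form a geometric series with ratio 0.6, whose sum converges to the first-year cost divided by 0.4, that is, 2.5 times the first-year cost. The function names and the unit cost of 1.0 are illustrative assumptions, and the sketch covers hardware cost only, not electricity.

```python
def cumulative_storage_cost(first_year_cost: float, annual_decline: float, years: int) -> float:
    """Total cost of keeping the same data stored for `years` years,
    assuming the yearly price falls by `annual_decline` each year."""
    ratio = 1.0 - annual_decline
    return sum(first_year_cost * ratio ** year for year in range(years))


def perpetual_storage_cost(first_year_cost: float, annual_decline: float) -> float:
    """Limit of the geometric series: the cost of storing the data forever."""
    return first_year_cost / annual_decline


if __name__ == "__main__":
    for years in (1, 5, 20, 100):
        print(years, round(cumulative_storage_cost(1.0, 0.40, years), 4))
    print("limit", perpetual_storage_cost(1.0, 0.40))
```

The cumulative cost approaches the limit quickly, which is why the hardware cost alone does not justify a large perpetual rent.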

The bloat of state affects mainly miners, who cannot start mining a child block containing transactions before the parent block has been fully validated. If the state does not fit in RAM, or in SSD, then state access is greatly slowed down, and miners must mine empty blocks until they fully verify a candidate block. This creates incentives for centralization, as bigger pools, having better hardware, would not suffer from this delay. Miners have incentives to use faster and more reliable storage and faster CPUs to reduce the verification delay. So it seems natural that storage rent is mainly paid to miners. Even if we wanted to redirect part of the fees to full nodes, there is no tested secure protocol to perform this payment. In RSK we’re testing the Proof of Unique Blockchain Storage (PoUBS) protocol, which I designed a few years ago and presented at Devcon3 in 2017. PoUBS is currently the only protocol that has the potential to solve the full-node reward problem without trusted parties.

The Problem of Storage Rent on DApps

For contracts that are controlled by a central entity, it’s clear that the owner should pay the storage rent. But for some community smart contracts it’s unclear who should pay this rent. Many contracts (and probably the most interesting ones) are crowd-contracts: programs that are fueled and used by the crowd, without owners or managers. Crowd-contracts can consume a lot of contract storage, but no single user is in a position to carry the burden of paying the rent, and no single user will be willing to.

One can imagine that a well-designed crowd-contract should have a revenue generation method for paying the storage rent. For example, each crowd-contract method call could also pay into a special rent account where the crowd-contract collects all rent-oriented income. However, this approach has several problems:

  • Most crowd-contracts are meant to be immutable, so such a revenue-collecting method must be defined before the contract is deployed. If there is a direct relation between a user and the memory it consumes, each user can pay a partial rent independently. But if this is not the case, and most users only receive a service but do not consume storage, then it will be unclear what proportion should be paid by each operation to collect enough funds before the rent deadline.
  • The cost in gas required to manage the rent collection process may be so high that it makes the service offered too expensive. The collection process involves several steps, such as computing the amount of rent each user must pay, collecting rents, keeping a registry of which users have or haven’t paid the rent, removing the data of users who did not pay the rent, etc.

In an ideal world, a DNS-like contract would manage an independent balance to pay a rent for each name registered. However, as previously stated, this approach may be highly inefficient, as each rent payment for a small chunk of storage may represent a hundredth of a US cent. Why encapsulate this payment in a transaction that costs 100 times more?

The Problems with Collecting Rents for Fixed Periods

There are several complications that arise when trying to implement storage rent as a payment for a fixed period, ending at a specific deadline. Because RSK/Ethereum cannot schedule code execution, the rent-paying code would need to be triggered by a message coming from the outside world before the rent deadline arrives. The payment interval should be long enough to prevent miners from censoring the rent-paying transactions. If deterministic deadlines are set to some fixed time after the contract creation time, then the simultaneous creation of multiple contracts in the same block can lead to a high number of contracts requesting deadline checks at the same time. Therefore rent-deadline events should be chosen randomly at contract creation time, so events become evenly distributed.
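As a rough illustration of spreading the deadlines, the sketch below derives a pseudo-random offset from a hash of the contract address, so that contracts created in the same block do not share a deadline. The period length, the use of SHA-256 and the function name are assumptions made for this example and are not taken from any RSKIP.

```python
import hashlib

RENT_PERIOD_BLOCKS = 1_000_000  # hypothetical length of one rent period, in blocks


def first_rent_deadline(contract_address: bytes, creation_block: int) -> int:
    """Block height of the first rent deadline for a newly created contract.

    The offset is derived from a hash of the address, so contracts created in
    the same block get deadlines spread across a whole rent period.
    """
    digest = hashlib.sha256(contract_address).digest()
    offset = int.from_bytes(digest[:8], "big") % RENT_PERIOD_BLOCKS
    return creation_block + RENT_PERIOD_BLOCKS + offset
```

Because the offset depends only on the address, every node computes the same deadline without any extra stored state.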

Monthly (or shorter) payment periods put too much pressure on users. As a comparison, owners of domain names prefer to pay annual fees rather than worry about a monthly fee. Yet even with annual payments, rent payments for simple accounts without code would consume more computing resources to verify than the amount of rent being paid.

At RSK we tested a design based on fixed rent periods and realized it increases the complexity and cost of contract programming. Therefore we decided that the RSK platform would use a simpler approach and collect rent for the intervals between uses, rather than for long fixed periods of time. It will also postpone rent payments until the amount to pay is over a minimum threshold.

Solutions to Micro-rents

One way to tackle the problem of micro-rents is by using a probabilistic approach. A random user (identified by its account) is pseudo-randomly chosen to pay the rent for all the users, for a period. Another probabilistic approach is pseudo-randomly selecting one in every 100 transactions that call a specific contract and forcing it to pay the full rent. The pseudo-random selection would be based on the block hash. However, this implies that the result of this transaction cannot be reflected in the world-state of the current block, but only in the next. Also, if the parent block hash is used as the random seed, miners can reorder transactions so that certain favored users never pay the rent. Therefore a probabilistic approach seems inadequate.
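The reordering weakness can be illustrated with a short sketch. It assumes the selection depends on the parent block hash and on the position of the transaction in the block; that rule, the SHA-256 hash and both function names are hypothetical and only serve to show the problem.

```python
import hashlib
from typing import List


def must_pay_full_rent(parent_block_hash: bytes, tx_index: int, one_in: int = 100) -> bool:
    """Pseudo-randomly select roughly one in `one_in` transaction positions."""
    data = parent_block_hash + tx_index.to_bytes(4, "big")
    value = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return value % one_in == 0


def reorder_to_spare(txs: List[str], favored: str, parent_block_hash: bytes) -> List[str]:
    """What a miner could do: move `favored` to a position that is not selected."""
    others = [tx for tx in txs if tx != favored]
    for index in range(len(txs)):
        if not must_pay_full_rent(parent_block_hash, index):
            return others[:index] + [favored] + others[index:]
    return txs  # every position is selected, so reordering gains nothing
```

Since the seed is known before the block is built, the miner can evaluate the selection for every position in advance and place a favored transaction accordingly.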

A better way to avoid these problems is for every operation on a contract to pay a rent proportional to the amount of storage the contract has acquired multiplied by the contract's last inactivity period. This is not entirely fair, as a user who uses the contract a single time is forced to pay for all the memory previously acquired. However, this system is fair assuming that:

  • miners do not reorder transactions in a block to favor some users (because the rents are micro-payments, there is little incentive to do so).
  • contract calls are spaced evenly over time, and not concentrated in a few blocks, after long periods of inactivity.

Under these assumptions, users will pay a share of the contract rent in proportion to their usage rate.
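A minimal sketch of this rule follows, assuming rent is measured in gas and time in seconds. The rate of 3000 gas per 32 bytes per year is the estimate quoted in this post; the `Contract` class, the function names and the `min_rent` parameter (standing in for the minimum threshold below which collection is postponed) are hypothetical.

```python
from dataclasses import dataclass

RENT_GAS_PER_CELL_YEAR = 3000      # estimated rate quoted in this post
CELL_BYTES = 32
SECONDS_PER_YEAR = 365 * 24 * 3600


@dataclass
class Contract:
    storage_bytes: int
    last_rent_time: int  # timestamp (seconds) of the last call that paid rent


def rent_due(contract: Contract, now: int) -> int:
    """Rent in gas: storage size multiplied by the inactivity period."""
    cells = -(-contract.storage_bytes // CELL_BYTES)  # round up to whole cells
    inactive = now - contract.last_rent_time
    return RENT_GAS_PER_CELL_YEAR * cells * inactive // SECONDS_PER_YEAR


def call_contract(contract: Contract, now: int, min_rent: int = 0) -> int:
    """Charge the caller the rent accrued since the last payment.

    If the amount is below `min_rent`, nothing is charged and the rent keeps
    accruing until a later call pays it.
    """
    due = rent_due(contract, now)
    if due < min_rent:
        return 0
    contract.last_rent_time = now
    return due
```

With calls spaced evenly over a year, each call pays the same share; a single call after a year of inactivity pays the whole year's rent.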

To Kill or not to Kill

One has to decide what to do if a contract does not pay the rent (or no user pays the rent for a contract). Killing the offending contract seems like an outrageous action: the user's asset balances would be burned if the user forgets to pay the rent. Also, a removed account would allow old transactions from that account to be replayed. A softer alternative is required. At RSK we analyzed two options:

  1. Forced Hibernation
  2. Cache eviction

Hibernation means that a contract's state is replaced by a single hash digest of that state. The block height at which the contract was hibernated is also stored. Later, the user can recover the contract, including its balance, by providing the missing pre-image information. A user can inspect the stored hibernation block height to query a peer and obtain the missing data, if a peer still has a copy of it.
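The hibernate and recover steps can be sketched as follows. This is only an illustration: the `Account` class, the function names and the use of SHA-256 as the digest are assumptions, and the state is treated as an opaque serialized byte string.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    state: Optional[bytes]                # serialized state; None while hibernated
    state_digest: Optional[bytes] = None  # digest kept in place of the state
    hibernated_at: Optional[int] = None   # block height of hibernation


def digest(data: bytes) -> bytes:
    # SHA-256 is a stand-in; the hash function is an assumption of this sketch.
    return hashlib.sha256(data).digest()


def hibernate(account: Account, block_height: int) -> None:
    """Replace the full state with its digest and record the block height."""
    if account.state is None:
        raise ValueError("account is already hibernated")
    account.state_digest = digest(account.state)
    account.hibernated_at = block_height
    account.state = None


def wake_up(account: Account, preimage: bytes) -> None:
    """Restore the state if the supplied pre-image matches the stored digest."""
    if account.state is not None:
        raise ValueError("account is not hibernated")
    if digest(preimage) != account.state_digest:
        raise ValueError("pre-image does not match the stored digest")
    account.state = preimage
    account.state_digest = None
    account.hibernated_at = None
```

The node only needs the digest to verify a recovery request, so it can discard the full state as long as some peer or the user keeps a copy.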

Cache eviction means moving the contract data to a slower, cheaper storage device and removing it from faster, more expensive memory. For example, high-availability contracts could be kept in RAM, while infrequently used contracts are moved to SSD storage.

We decided to start by implementing cache eviction, and to propose replacing it with forced hibernation in a future community hard fork only if cache eviction does not deliver in practice the expected scalability boost. Another reason is that some RSK/Ethereum accounts consume so little space that hibernation would actually increase the amount of data to store. Hibernation of small accounts can still be beneficial if we allow cascaded hibernation of account nodes in a binary tree. However, this feature requires an additional binary tree to uniquely number all leaf nodes corresponding to contracts/accounts, as specified by RSKIP18.

Packing Rent Payments into Transactions

Because of the low cost of HDD and SSD storage, most rent payments will correspond to micro-transactions. We estimate the rent rate in RSK to be approximately 3000 units of gas per 32 bytes of storage per year. If paying the rent requires sending a dedicated transaction, the fees paid for the transaction (21K gas) far exceed the amount transacted. This is a huge overhead for the network and it’s against the common good. Therefore it is preferable to piggyback the payment on a transaction that performs another operation. This saves bandwidth and reduces computation time to a single digital signature verification. For RSK, we’ve added a new field to transactions (gasRentCount), where the transaction owner specifies how much gas it will pay for storage rent for all contracts called either directly or indirectly. By using a different gas count field for the rent, we maintain compatibility with Ethereum contracts that make CALLs with embedded amounts of gas. When no rent needs to be paid, the field can be empty, and it consumes minimal transaction space. You can see the full specification of this approach in RSKIP61.
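The overhead and the separate rent budget can be sketched as follows. The two gas constants come from the paragraph above; the `Transaction` class, the function names and the choice to reject a transaction whose rent budget is too small are assumptions of this example, so RSKIP61 remains the reference for the actual rules.

```python
from dataclasses import dataclass
from typing import Sequence

TX_BASE_GAS = 21_000            # base cost of a transaction, from the text
RENT_GAS_PER_CELL_YEAR = 3_000  # estimated rent rate, from the text


@dataclass
class Transaction:
    gas_limit: int           # gas for execution, as in Ethereum
    gas_rent_count: int = 0  # separate gas budget reserved for storage rent


def collect_rent(tx: Transaction, rents_due: Sequence[int]) -> int:
    """Deduct the rent of every contract called from the rent budget.

    Raising when the budget is too small is an assumption of this sketch.
    """
    total = sum(rents_due)
    if total > tx.gas_rent_count:
        raise ValueError("gasRentCount does not cover the rent due")
    tx.gas_rent_count -= total
    return total


def dedicated_tx_overhead(cells: int, years: float) -> float:
    """How many times the base fee of a dedicated transaction exceeds the rent."""
    return TX_BASE_GAS / (RENT_GAS_PER_CELL_YEAR * cells * years)
```

For a single 32-byte cell held for one year, `dedicated_tx_overhead(1, 1.0)` evaluates to 7.0, so a dedicated payment would cost seven times the rent it carries; for shorter intervals the ratio grows further.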

Previous (abandoned) attempts are documented in RSKIP07, RSKIP17, RSKIP21, RSKIP27, and RSKIP52.

Conclusions

Storage rent does not prevent short-term storage spam attacks, but it’s needed for fairness and for long-term scalability. Without storage rent, the community may be sacrificing resources for short-term gains and preventing long-term success. Storage rent also protects the blockchain against miscalculations and erroneous predictions about technology or the blockchain adoption rate. However, implementing storage rent is complex: as many rent payments are micro-transactions, the system implemented must make sure the rent collection cost does not introduce a new limiting factor to scaling. RSK will implement storage-rent payments in individual transactions, proportional to the target contract's last inactivity period and current storage consumption.

 
