What's on with Ethereum: Scaling

A lot of the conversation about blockchains nowadays is dominated by NFTs, DeFi, centralized exchange issues, and so on. While those are all interesting topics, it’s important for builders in the space to shine a light on work that is being done behind the scenes. Topics like blockchain scaling, privacy applications, and MEV don’t get nearly as much attention. However, talking about the progress being made behind the scenes helps those who aren’t in the trenches building understand what the applications of tomorrow might look like. It’s the applications of tomorrow that will get us closer to true mainstream adoption. Today I want to discuss the scaling of the Ethereum blockchain, why it matters, and what it can mean for future applications.

Why scaling matters

When we talk about the scalability of any system we are talking about the system's ability to handle changes in workloads and user demands. With regard to Ethereum, we are concerned with how the system handles increased on-chain activity. As the number of people that use the blockchain increases, the chain can hit capacity limitations. These limitations affect both the usability of the network as well as the cost to use the network.

Usability

As a society, we have become accustomed to how fast the internet is and the platforms that run on it. The progress of the internet over the past few decades has gifted us with rich user experiences and the expectation of near-instant communication. While we hope that Ethereum can become an integral component of the internet of tomorrow, it is not yet capable of scaling to the demand of platforms we use every day.

Currently, Ethereum can process about 15 transactions per second (TPS). For reference, it is estimated that Visa can handle somewhere around 65,000 TPS. Ethereum is nowhere close to the throughput that web2 industry leaders are capable of. This is an issue because, with throughput this low, transactions must wait in the mempool until they get picked up.
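To get a feel for what that gap means, here is a rough back-of-the-envelope sketch. It uses the two throughput figures quoted above; the backlog size is a made-up number purely for illustration, and real networks do not process transactions at a perfectly constant rate.

```python
ETHEREUM_TPS = 15    # approximate figure quoted above
VISA_TPS = 65_000    # estimated figure quoted above


def seconds_to_clear(pending_transactions: int, tps: float) -> float:
    """Time needed to work through a backlog at a fixed throughput."""
    return pending_transactions / tps


backlog = 9_000  # hypothetical number of pending transactions

print(seconds_to_clear(backlog, ETHEREUM_TPS))  # 600.0 seconds, i.e. 10 minutes
print(seconds_to_clear(backlog, VISA_TPS))      # well under one second
```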

The mempool can be thought of as a waiting area for transactions that have yet to be added to a block and are still unconfirmed.

For example, when you send a transaction from an exchange you often see a component on the page that is waiting for the transaction to be confirmed. During times of high load, you may be waiting for several minutes. It’s also worth considering that Ethereum is a smart contract blockchain, so it needs to handle more complex transactions than sending ETH from one address to another.

So we now know that Ethereum has low throughput, which means it can’t process many transactions per second. We also know that the network often hits that throughput limit, which means transactions have to wait in the mempool to be added to a block. Since demand for inclusion in a block is high, this drives transaction fees up.
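The link between limited block space and rising fees can be sketched with a toy mempool. This is a simplification under stated assumptions: the block holds a fixed number of transactions (real blocks are limited by gas, not by count), the block builder simply picks the highest-paying transactions first, and all names and numbers are hypothetical.

```python
import heapq

BLOCK_CAPACITY = 3  # hypothetical: transactions per block, kept tiny for readability


def build_block(mempool, capacity=BLOCK_CAPACITY):
    """Pop the highest-fee transactions out of the mempool into a block."""
    block = []
    while mempool and len(block) < capacity:
        neg_fee, tx_id = heapq.heappop(mempool)
        block.append((tx_id, -neg_fee))
    return block


# heapq is a min-heap, so fees are stored negated to pop the largest first.
mempool = []
for tx_id, fee in [("a", 5), ("b", 1), ("c", 9), ("d", 3), ("e", 7)]:
    heapq.heappush(mempool, (-fee, tx_id))

first_block = build_block(mempool)
print(first_block)                              # [('c', 9), ('e', 7), ('a', 5)]
print(sorted(tx_id for _, tx_id in mempool))    # ['b', 'd'] are still waiting
```

Transactions "b" and "d" stay in the waiting area until a later block, and the only way for them to jump the queue is to offer a higher fee.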

Cost

Transaction fees are the cost you pay to interact with the Ethereum blockchain. These fees are called gas, and gas is paid in Ethereum’s native token, Ether. I won’t cover much about the mechanics of gas on the Ethereum network. All you really need to know for the purpose of this post is that the price you pay for gas to get a transaction included in a block depends on the current demand on the network. Since the Ethereum network already struggles to keep up with demand, there are times when gas costs more than the average user is willing to pay. This is not what we want when talking about the usability of a system.
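As a rough illustration of how demand feeds into cost, a fee can be thought of as the gas a transaction uses multiplied by the price per unit of gas. The sketch below deliberately ignores the split between base fee and priority fee; a plain ETH transfer uses 21,000 gas, while the two gas prices are hypothetical values chosen only to show a quiet versus a congested period.

```python
GWEI_PER_ETH = 1_000_000_000
SIMPLE_TRANSFER_GAS = 21_000  # gas used by a plain ETH transfer


def fee_in_gwei(gas_used: int, gas_price_gwei: int) -> int:
    """Simplified fee: gas used times the price paid per unit of gas."""
    return gas_used * gas_price_gwei


def gwei_to_eth(gwei: int) -> float:
    return gwei / GWEI_PER_ETH


quiet = fee_in_gwei(SIMPLE_TRANSFER_GAS, 20)   # hypothetical quiet-period price
busy = fee_in_gwei(SIMPLE_TRANSFER_GAS, 200)   # hypothetical congested price

print(quiet, "gwei =", gwei_to_eth(quiet), "ETH")
print(busy, "gwei =", gwei_to_eth(busy), "ETH")
```

The transaction does the same work in both cases; only the price per unit of gas changes, and the fee scales with it.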

Why is it so difficult to fix

So you might ask why this hasn’t been fixed yet. It turns out it’s not as easy or as straightforward as you might think. Ethereum is designed to be decentralized and secure, so any improvement to the scalability of the system has to be made without sacrificing decentralization or security.

Decentralization

When we talk about decentralization we mean that Ethereum is run by many trustless nodes across the world, as opposed to being centralized and run by a small group of trusted nodes. For example, Ethereum could increase the requirements to run a node (faster computers, more memory, etc.). However, this would mean that fewer people have the ability to run a node. This would in turn lead to fewer nodes and a system that is run by fewer people (less decentralized).

Security

For security, we are referring to a blockchain’s ability to resist an attack, even if there are malicious nodes on the network. Since Ethereum is a decentralized network, anyone can run a node and participate in the execution of the chain. There are no guarantees that someone participating has good intent. So you need a system that can defend against a percentage of malicious nodes. Ideally, it should be able to handle up to 50%.

Where are we now?

Currently, Ethereum is both decentralized and secure. It is decentralized because it is run by thousands of nodes across the world, by both organizations and individuals. This is possible because the hardware requirements to run a node are low. It is secure because every node in the network keeps a copy of the state of the blockchain, and the consensus mechanism (Proof of Stake) makes sure that these nodes agree on that state even if a percentage of them aren’t honest.

The difficulty comes when you try to scale this system. As mentioned earlier, by scaling we mean making the blockchain run faster and cheaper. Vitalik Buterin coined the term “Scalability Trilemma” for this difficulty. Right now it’s relatively easy to create a blockchain that has two of the three attributes (decentralized, secure, and scalable), but hard to get all three, so most blockchains have to choose. As you can see in the diagram below, there are chains that scale and are secure but might lack decentralization. The same is true for the other combinations.

So what’s the plan

The scaling roadmap for Ethereum has been debated since the creation of the blockchain. There have been attempts at scaling that have led to new blockchains being created such as alternative layer ones, sidechains, or layer twos.

Layer ones are blockchains, such as Ethereum, that validate and execute transactions without support from another network. Others include Bitcoin and Cardano. There are alternative chains to Ethereum that may prioritize scaling over the decentralization and security that Ethereum values. When deciding which layer one to use, a user must decide which attributes they value most.

Sidechains are blockchains that run independently of a layer one chain but create a two-way bridge that assets can flow through. This works by locking up an asset on one chain and creating a copy of it on the other, and vice versa. This is beneficial because you can create a sidechain that has attributes (like speed) that the mainchain doesn’t have. You can interact on the sidechain and take advantage of those attributes, and when you are done you can send your assets back to the mainchain. However, there are security concerns with this approach. Most notably, the mainchain has no knowledge of what is going on with your assets on the sidechain.
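The lock-and-copy mechanism can be sketched as a toy model. This is not how any particular bridge is implemented; the `Bridge` class and its method names are hypothetical, and real bridges involve smart contracts on both chains plus some party that relays messages between them.

```python
class Bridge:
    """Toy two-way bridge: lock on the mainchain, mint a copy on the sidechain."""

    def __init__(self):
        self.locked_on_mainchain = 0   # total assets held by the bridge
        self.sidechain_balances = {}   # user -> copied (wrapped) balance

    def deposit(self, user, amount):
        """Lock assets on the mainchain and create the same amount on the sidechain."""
        self.locked_on_mainchain += amount
        self.sidechain_balances[user] = self.sidechain_balances.get(user, 0) + amount

    def withdraw(self, user, amount):
        """Destroy the sidechain copy and release the locked mainchain assets."""
        if self.sidechain_balances.get(user, 0) < amount:
            raise ValueError("not enough sidechain balance to withdraw")
        self.sidechain_balances[user] -= amount
        self.locked_on_mainchain -= amount
        return amount


bridge = Bridge()
bridge.deposit("alice", 5)
bridge.withdraw("alice", 2)
print(bridge.locked_on_mainchain, bridge.sidechain_balances)  # 3 {'alice': 3}
```

The security concern shows up in the model too: the mainchain only sees the locked total, so it has to trust whatever the bridge reports about balances on the other side.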

Layer twos are separate blockchains that extend the mainchain (in our case Ethereum). Most recently, the Ethereum roadmap has focused on a layer two approach to scaling. This approach can enable the scaling we desire while still preserving the security and decentralization we already have.

Layer 2 scaling

There are several types of layer twos: state channels, plasma, validium, and rollups. However, for the sake of this post, we are going to focus on rollups because that seems to be the approach that is getting the most traction in the ecosystem at the moment (for many reasons I won’t cover here).

So what are Rollups? Ethereum.org gives a great short definition:

Rollups perform transaction execution outside layer 1 and then the data is posted to layer 1 where consensus is reached. As transaction data is included in layer 1 blocks, this allows rollups to be secured by native Ethereum security. — source

Since execution happens off-chain on the layer 2, it can be much faster and cheaper. However, we aren’t sacrificing the security guarantees of layer 1 Ethereum, because after execution the state is submitted to the mainchain. In essence, the layer 2 chain can batch a bunch of transactions and submit them as one transaction to the Ethereum mainchain. There are currently two main approaches to doing this: optimistic rollups and zero-knowledge rollups.
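One way to see why batching makes things cheaper is to split the cost of a batch into a fixed layer 1 overhead that is shared by everyone in the batch and a small per-transaction data cost. The numbers below are hypothetical and only meant to show the shape of the saving, not real gas figures.

```python
def cost_per_transaction(batch_size, fixed_l1_cost, data_cost_per_tx):
    """Each transaction pays its own data cost plus an equal share of the fixed cost."""
    return fixed_l1_cost / batch_size + data_cost_per_tx


FIXED_L1_COST = 100_000   # hypothetical overhead of one layer 1 submission
DATA_COST_PER_TX = 500    # hypothetical cost of one transaction's data

for batch_size in (1, 100, 1_000):
    print(batch_size, cost_per_transaction(batch_size, FIXED_L1_COST, DATA_COST_PER_TX))
# 1    -> 100500.0
# 100  -> 1500.0
# 1000 -> 600.0
```

As the batch grows, the shared overhead shrinks toward zero and the per-transaction data cost becomes the part that matters.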

Optimistic rollups

In optimistic rollups, the layer two chain assumes that the submitted transactions are valid and just runs the computation to generate the new state. For example, after person one sends one ETH to person two, the L2 chain assumes the transaction is valid and the new state should say person two now has one ETH. That state is then submitted back to the Ethereum mainchain. The security of the assets is guaranteed because there is a lockup period (ex: one week) during which anyone can challenge the validity of the submitted state by submitting a fraud proof.

Similar to a sidechain, if you want to use an optimistic rollup you still have to lock up your assets on layer 1 so that they can be used on the optimistic rollup. However, unlike a sidechain, there is no duplication of your asset since all of the state is safely maintained on the mainchain. If you want to exit back to the mainchain, you just have to wait for the lockup period mentioned above to be over and your funds will be released back for use on the layer 1 mainchain.
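The lockup period can be sketched as a small state machine. This is a simplified model, not the protocol of any live rollup: the class and method names are hypothetical, and comparing two state roots stands in for the real dispute process, which is resolved on layer 1.

```python
CHALLENGE_PERIOD = 7 * 24 * 60 * 60  # one week in seconds, matching the example above


class StateCommitment:
    """A new L2 state posted to the mainchain and assumed valid unless challenged."""

    def __init__(self, state_root, submitted_at):
        self.state_root = state_root
        self.submitted_at = submitted_at
        self.challenged = False

    def challenge(self, now, recomputed_root):
        """Mark the commitment as fraudulent if a different result is shown in time."""
        if now - self.submitted_at >= CHALLENGE_PERIOD:
            return False  # window closed, too late to dispute
        if recomputed_root != self.state_root:
            self.challenged = True
        return self.challenged

    def is_final(self, now):
        """Funds can be released once the window passes without a successful challenge."""
        return not self.challenged and now - self.submitted_at >= CHALLENGE_PERIOD


honest = StateCommitment("root-a", submitted_at=0)
print(honest.is_final(now=CHALLENGE_PERIOD - 1))  # False, still inside the window
print(honest.is_final(now=CHALLENGE_PERIOD))      # True, safe to withdraw
```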

There are several optimistic rollups already live and ready for use. Two popular ones are Optimism and Arbitrum.

Zero-knowledge rollups

Finally, we have zero-knowledge rollups. These rollups are similar to optimistic rollups as they take the load of computation off of the layer 1 mainchain. However, they don’t make assumptions about the submitted transactions. Rather they use zero-knowledge proofs (ZKP) to validate the transactions and submit a valid ZKP back to the mainchain along with the new state. A ZKP is a complex mathematical proof that guarantees the validity of the state that is submitted to the mainchain. I won’t go into detail about ZKPs here but I’ll post some useful links below for further reading.

In contrast to optimistic rollups, this has the added benefit that a fraud proof does not need to be submitted back to the mainchain. Instead, we can rely on the validity of the ZKP, which means there is no lock-up period.

There are some ZK rollups that are currently live but only support limited capabilities, such as simple transactions or DeFi (decentralized finance). There is still work that needs to be done to make a general-purpose ZK rollup. By general purpose, I mean that any smart contract that can run on the Ethereum mainnet can also run on the ZK rollup. This is often referred to as a zkEVM (zero-knowledge Ethereum Virtual Machine). For this reason, most developers are currently building on optimistic rollups.

One team currently working on a zkEVM is zkSync. An example of a zk-rollup that isn’t general purpose yet is Aztec Protocol.

If you’d like to learn more about what teams are building rollups, I would recommend checking out L2BEAT. They do a great job of comparing all of the current Layer 2 protocols. Specifically, you can learn more about the design decisions of different layer twos, their market share, and much more.

Sharding

Now that we have covered layer 2s, it’s important to mention that we haven’t forgotten about the Ethereum mainchain (layer 1). Ethereum has a rich development roadmap. On the roadmap (estimated sometime in 2023 or 2024) is something called sharding.

Sharding, in a traditional sense, is the process of splitting a database horizontally to spread the load. This means that a database can be split into multiple databases where each database is responsible for a specific set of data. For Ethereum, a similar thing will be accomplished by splitting the chain into new chains, known as “shards.” This will mean that every node is no longer responsible for processing every transaction across the network. This will benefit the network by reducing congestion and increasing transactions per second.
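The traditional database version of this idea can be sketched in a few lines. This shows hash-based sharding in the general sense described above, not Ethereum’s actual sharding design; the shard count and the addresses are made up for illustration.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of shards


def shard_for(address: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard deterministically by hashing it."""
    digest = hashlib.sha256(address.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards


# Group some made-up addresses by the shard responsible for them.
addresses = ["0xaaa", "0xbbb", "0xccc", "0xddd", "0xeee"]
shards = {}
for address in addresses:
    shards.setdefault(shard_for(address), []).append(address)

for shard_id in sorted(shards):
    print(shard_id, shards[shard_id])
```

Because the mapping is deterministic, anyone can work out which shard holds a given key, and each shard only has to handle the slice of data assigned to it.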

Sharding works well with the rollup scaling roadmap. Since rollups submit all state changes to the mainchain, rollups can still be bottlenecked by the speed of the Ethereum layer 1. So any improvement to L1 scaling is also an improvement for L2 scaling.

How does this impact the future

So I hope I’ve convinced you that there are a lot of smart people working on scaling Ethereum. I want to talk about what that actually means for the ecosystem next.

Usability

It is estimated that with rollups we can get to around 5k transactions per second. That is a big improvement from the current 15 transactions per second. This still isn’t close to the estimated 65k transactions per second of Visa but it’s a drastic change and there are general-purpose optimistic rollups live today.

When sharding is finally added to Ethereum, it is estimated that Ethereum will support many thousands of transactions per second on the mainnet. This will enable a much higher throughput for layer 2s and will get us much closer to the scale that we expect of tech giants today.

Cost

Currently, transaction fees on Ethereum mainnet can be high during times of high on-chain activity, such as a big NFT mint. However, if you use one of the live optimistic rollups or ZK rollups, you can see costs that are much more reasonable. L2fees is a site that tracks the current fees across networks. At the time of writing this post, sending ETH costs $0.01 on Loopring (ZK rollup) and $0.06 on Optimism (optimistic rollup), whereas the cost on Ethereum mainnet is currently $0.45. This is already a great saving, and it will continue to get better as rollups see more adoption and as mainnet makes more upgrades in the time to come.

Applications

Outside of DeFi and NFTs, we haven’t seen many mainstream applications get adoption. This is in part due to the scaling issues we have discussed. As a user, you wouldn’t want to use a decentralized chat app built on Ethereum that takes several minutes to send a message. However, when the scaling issues are solved, we will start to see more traditional applications that use the blockchain. For example, there is a social media protocol currently in development called Lens. This protocol aims to be the backend for current and future social applications. Imagine you build a big following on one social media platform but want to start using a new platform that comes out. With Lens, all your social data (posts, followers, etc.) is on the blockchain and owned by you. You would be able to sign into the new platform with your existing data, and all your posts and followers would exist on the new platform. It’s applications like these that get me excited for the future of the tech, and I think it is coming fast.

Additional Resources: