Mar 30, 2021 | samczsun
This is the first in a series of blog posts about the bugs I've found in go-ethereum (Geth), the official Golang implementation of the Ethereum protocol. While you don't need a deep understanding of Geth in order to follow these blog posts, knowledge of how Ethereum itself works will be helpful.
This first post is about a bug in Geth's uncle validation routine which did not behave correctly given a specially crafted uncle. If exploited, this could have caused an accidental fork between Geth and Parity nodes.
Every blockchain has a canonical chain, which is defined through some metric like the total length of the chain or the amount of work required to produce the total chain. However, network latency means that sometimes two blocks can be produced at the same time. Only one block can be included in the canonical chain, so the other block must be left out.
Some blockchains such as Bitcoin will completely ignore these blocks, rendering them orphaned from the canonical chain. Other blockchains such as Ethereum will still reward miners who put in the effort but rolled a nat 1 on block propagation. In Ethereum, these orphaned blocks that still get included are known as uncles.
An uncle needs to satisfy certain conditions in order to be considered valid. First, all of the block's properties must be valid according to normal consensus rules, and second, an uncle block must be the child of a block at most 6 blocks away from the current head of the chain. However, one exception applies: while normal blocks must not be more than 15 seconds in the future, an uncle block is not restricted in this way.
Most programming languages have the concept of platform-dependent integers and fixed-width integers. Platform-dependent integers may be 32 bits or 64 bits (or something else!) depending on the platform that the program was compiled on. In C/C++ and Go you might use uint
while in Rust you might use usize
.
However, sometimes the programmer might want to guarantee that their variable can hold 64 bits of data, even if the platform is 32-bit. In these cases, the programmer can use a fixed-width integer type. In C/C++ that would be uint64_t
, in Go uint64
, and in Rust u64
.
The benefit of these built-in integer types is that they're all first-class citizens and thus are very simple to use. Consider this implementation of the Collatz Conjecture which supports 64-bit integers.
func collatz(n uint64) uint64 {
if n % 2 == 0 {
return n / 2
} else {
return 3 * n + 1
}
}
However, this implementation has a slight flaw, it doesn't support inputs which are larger than 64 bits. For that, we'll need big integers. Most languages support this either within the standard library itself such as big.Int
in Go, or through external libraries for C/C++ or Rust.
Unfortunately, using big integers has a big downside: they're much clunkier to use. This is illustrated in the reimplementation of the Collatz Conjecture, but supporting arbitrarily large integers.
var big0 = big.NewInt(0)
var big1 = big.NewInt(1)
var big2 = big.NewInt(2)
var big3 = big.NewInt(3)
func collatzBig(n *big.Int) *big.Int {
if new(big.Int).Mod(n, big2).Cmp(big0) == 0 {
return new(big.Int).Div(n, big2)
} else {
v := new(big.Int).Mul(big3, n)
v.Add(v, big1)
return v
}
}
Clearly, the 64-bit version is much simpler to write and read, and so it's no surprise that programmers like to use simple integer types when possible.
In Ethereum, most data is expected to fit within 256 bits, although some fields are only expected to be an integer value with no limit on size. Notably, the block timestamp Hs
is defined to be a 256-bit integer.
The Geth team attempted to be faithful to this definition by validating that the timestamp on an uncle block was not larger than 2256-1
. Recall that uncles have no restrictions on future mining.
// Verify the header's timestamp
if uncle {
if header.Time.Cmp(math.MaxBig256) > 0 {
return errLargeBlockTime
}
} else {
if header.Time.Cmp(big.NewInt(time.Now().Add(allowedFutureBlockTime).Unix())) > 0 {
return consensus.ErrFutureBlock
}
}
Unfortunately, the code then immediately coerced the block timestamp into a 64-bit integer in order to calculate the correct difficulty for the block.
// Verify the block's difficulty based in it's timestamp and parent's difficulty
expected := ethash.CalcDifficulty(chain, header.Time.Uint64(), parent)
This would not be so bad if Parity behaved in the same way, but Parity would saturate the timestamp at 264-1
instead of overflowing.
let mut blockheader = Header {
parent_hash: r.val_at(0)?,
uncles_hash: r.val_at(1)?,
author: r.val_at(2)?,
state_root: r.val_at(3)?,
transactions_root: r.val_at(4)?,
receipts_root: r.val_at(5)?,
log_bloom: r.val_at(6)?,
difficulty: r.val_at(7)?,
number: r.val_at(8)?,
gas_limit: r.val_at(9)?,
gas_used: r.val_at(10)?,
timestamp: cmp::min(r.val_at::<U256>(11)?, u64::max_value().into()).as_u64(),
extra_data: r.val_at(12)?,
seal: vec![],
hash: keccak(r.as_raw()).into(),
};
This means that if a malicious miner included an uncle with a block timestamp of 584942419325-01-27 07:00:16 UTC
, or unix time 264
, then Geth would calculate the difficulty using unix time 0
while Parity would calculate the difficulty using unix time 264-1
. These two values would be different, so one of the two clients would have split from the canonical chain after failing to verify the block.
The Geth team fixed this bug in PR 19372, which switched all timestamps to use uint64
.
Every client participating in a consensus protocol must behave exactly identically, so what may seem like a completely benign operation might actually be the trigger which causes half of the network to disconnect. This also goes to show that you don't need to be highly technical in order to find impactful bugs, so if this seems like something you'd be interested in, there's no better way to get started than to dive right in.
Next time, we'll be exploring how Geth stores the data that makes up Ethereum, and how a skilled attacker could have planted a ticking time bomb that would hard fork the chain when detonated.
Disclaimer: This post is for general information purposes only. It does not constitute investment advice or a recommendation or solicitation to buy or sell any investment and should not be used in the evaluation of the merits of making any investment decision. It should not be relied upon for accounting, legal or tax advice or investment recommendations. This post reflects the current opinions of the authors and is not made on behalf of Paradigm or its affiliates and does not necessarily reflect the opinions of Paradigm, its affiliates or individuals associated with Paradigm. The opinions reflected herein are subject to change without being updated.