Bitcoin Core and Protocol Developments – Pieter Wuille on The Chaincode Podcast (Part I)

Check out The Chaincode Podcast Episode Page & Show Notes

Key Takeaways

Intro

  • Pieter Wuille (@pwuille) has been a Bitcoin protocol developer and contributor to Bitcoin Core since 2011
  • Hosts: John Newbery (@jfnewbery) & Jonas (@adamcjonas)
    • Chaincode is a technical podcast by Chaincode Labs to help on-board new contributors to Bitcoin

A Bit About Pieter

Headers-First Syncing (3:30)

  • A full node needs to download and validate the blockchain before it can validate recent transactions and blocks
  • Initial block download (IBD) was historically done via a getblocks message
    • Your node is basically saying, “I know about this block hash; fetch me more blocks”
    • Getblocks works fine when connected to one peer, but doesn’t parallelize well to multiple connections; why?
      • Each block header references the hash of the preceding block—validation requires sending blocks in order
      • Not receiving a parent block creates orphan blocks, which are stored in memory until validated
      • Pieter describes the problem: “It’s a mess because you don’t know where you’re going to start off at the beginning and ask ‘Hey, what’s next?’ And there’s huge attack potential because a malicious peer could say, ‘Trust me. I have a good chain for you, and in the end, it’ll have high difficulty,’ then just keep sending you low difficulty blocks”
  • Headers-first sync splits the sync process into two steps, the second of which runs in parallel across peers; here’s how it works
    • First, download all the block headers of the longest chain—this takes just minutes
    • Once you have the headers chain, you can download the block data from multiple peers in parallel
      • (Knowing how all the blocks are related eliminates orphan blocks)
  • The headers are cheap to verify but expensive to create—this improves network security
  • Load is spread among peers by limiting requests to 16 blocks at a time per peer
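
The two steps above can be sketched in a few lines of Python. This is only an illustration, not Bitcoin Core's code: `Header`, `link_headers`, and `assign_downloads` are hypothetical names, the hash is SHA-256 over a toy payload rather than a real header serialization, and the proof-of-work check on each header is left out. The per-peer limit of 16 is the figure quoted in the notes.

```python
import hashlib
from collections import deque
from dataclasses import dataclass

MAX_IN_FLIGHT_PER_PEER = 16  # per-peer request limit quoted in the notes


@dataclass(frozen=True)
class Header:
    prev_hash: str
    payload: str  # stand-in for the remaining header fields (hypothetical)

    @property
    def hash(self) -> str:
        data = (self.prev_hash + self.payload).encode()
        return hashlib.sha256(data).hexdigest()


def link_headers(genesis: Header, headers: list[Header]) -> list[Header]:
    """Step 1: accept headers only if each one points at the one before it."""
    chain = [genesis]
    for header in headers:
        if header.prev_hash != chain[-1].hash:
            raise ValueError("header does not connect to the chain")
        chain.append(header)
    return chain


def assign_downloads(chain: list[Header], peers: list[str]) -> dict[str, list[str]]:
    """Step 2: hand out block requests in chain order, at most 16 per peer."""
    pending = deque(h.hash for h in chain[1:])  # genesis is already known
    requests: dict[str, list[str]] = {peer: [] for peer in peers}
    for peer in peers:
        while pending and len(requests[peer]) < MAX_IN_FLIGHT_PER_PEER:
            requests[peer].append(pending.popleft())
    return requests
```

Because the header chain already fixes where every block belongs, a block arriving out of order from any peer can be slotted into place instead of being held as an orphan.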

Finding Fast Peers During IBD (13:00)

  • Bitcoin software only downloads blocks that are within 1,024 blocks from its current tip (this creates a forward-moving download window)
    • The lowest-height block in the window is validated next
      • The software disconnects any peer that prevents the window from moving forward (this eliminates the slowest peer)
  • There are heuristics to optimize for best peers:
    • Don’t kick the last peer to give you a block
    • Prefer peers from a variety of network sources
  • Checks were added in version 0.17 to confirm peers are on the same chain (following the same consensus rules)
    • This was done to prevent islands of nodes from following different consensus rules than the rest of the network without realizing it (this is especially important with Segwit2X and Bcash Hard-Forks)
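
The moving download window and the "slowest peer" rule can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the function names are hypothetical, heights stand in for block hashes, and there are no timeouts or peer-preference heuristics. Only the 1,024-block window size is taken from the notes.

```python
from typing import Optional

BLOCK_DOWNLOAD_WINDOW = 1024  # window size quoted in the notes


def requestable_heights(tip: int, best_header: int, received: set[int],
                        in_flight: dict[int, str]) -> list[int]:
    """Heights inside the window that are neither downloaded nor requested."""
    window_end = min(tip + BLOCK_DOWNLOAD_WINDOW, best_header)
    return [height for height in range(tip + 1, window_end + 1)
            if height not in received and height not in in_flight]


def stalling_peer(tip: int, best_header: int, received: set[int],
                  in_flight: dict[int, str]) -> Optional[str]:
    """Return the peer that is holding the window back, if any.

    A peer counts as stalling here when nothing else in the window can be
    requested and the next block to validate (tip + 1) is still owed by it.
    """
    if requestable_heights(tip, best_header, received, in_flight):
        return None  # there is still work to hand out, so nobody is blocking
    return in_flight.get(tip + 1)
```

With the tip at height 100, blocks 102 through 1124 already received, and block 101 still in flight from one peer, `stalling_peer` returns that peer: the window cannot advance until block 101 arrives.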

Bitcoin Test Evolution (18:11)

  • On development culture at Bitcoin Core: “It’s certainly become harder. Bitcoin Core had no tests at the time when I first started contributing. Testing meant manual testing: ‘Hey I tried to sync and it still works’. There were no unit or functional tests.” – Pieter Wuille
  • Matt Corallo’s pull-tester was a bot that tested pull requests
    • The bot implemented a test in BitcoinJ that simulated events like re-orgs to see if Bitcoin Core would follow the right path under different scenarios
  • Bitcoin repository now has over 130 functional tests

Ultraprune (21:55)

  • Ultraprune was deployed in version 0.8—it introduced the concept of an ‘explicit UTXO set’ to Bitcoin’s validation logic
    • Unspent Transaction Outputs (UTXO) refers to unspent coins that can be used as an input for a new transaction
  • Before Ultraprune, detailed records of every transaction output ever created (unspent or spent) were stored in an index; this used 12 bytes per output and resulted in a bloated database of several gigabytes
  • Since transactions are valid only if they spend UTXOs, a patch was developed to delete all transaction IDs whose outputs had all been spent (they weren’t needed to validate new transactions)
  • Ultraprune started as a proof-of-concept with the goal of trimming the database size as much as possible
    • The result: the database size was cut from several GBs to MBs (which translates to several hundred GBs in today’s blockchain size)
    • The blockchain is an ever-growing—append-only—database; the UTXO dataset is correlated with actual usage
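
The idea of validating against an explicit UTXO set can be sketched as follows. This is a toy model, not Bitcoin Core's data structures: `OutPoint`, `Tx`, and `apply_tx` are hypothetical names, amounts are plain integers, and scripts, signatures, and coinbase transactions are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutPoint:
    txid: str
    index: int


@dataclass
class Tx:
    txid: str
    inputs: list[OutPoint]
    outputs: list[int]  # output amounts only; scripts are omitted


def apply_tx(utxos: dict[OutPoint, int], tx: Tx) -> None:
    """Validate tx against the UTXO set, then update the set in place."""
    if len(set(tx.inputs)) != len(tx.inputs):
        raise ValueError("input listed twice")
    if any(outpoint not in utxos for outpoint in tx.inputs):
        raise ValueError("input is missing or already spent")
    if sum(tx.outputs) > sum(utxos[outpoint] for outpoint in tx.inputs):
        raise ValueError("outputs exceed inputs")
    # All checks passed: spent entries leave the set, new outputs enter it.
    for outpoint in tx.inputs:
        del utxos[outpoint]
    for index, amount in enumerate(tx.outputs):
        utxos[OutPoint(tx.txid, index)] = amount
```

Spent outputs are simply deleted, so the set grows and shrinks with actual usage, while the block history itself only ever grows.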

March 2013 Consensus Fork (26:50)

  • Version 0.8 saw the introduction of Ultraprune and a database switch from Berkeley DB (BDB) to LevelDB; here’s how the consensus fork happened
    • The BDB database used in the transaction index in version 0.7 required users to configure the number of lock objects that were needed (BDB documentation recommends running the DB under a stressful load to determine this value)
      • But, predicting the absolute maximum locks needed is impossible—it’s correlated with the number of pages in the database that are simultaneously affected by a single atomic transaction
        • (An atomic transaction is a database operation that succeeds as a whole or fails as a whole—it’s never partially completed. This has nothing to do with Bitcoin transactions.)
      • Applying a block to the database was done as one atomic update (the block is either validated and added in full, or it fails and nothing is written)
    • Version 0.8 was much faster than 0.7, so miners immediately upgraded before the rest of the network, but:
      • Version 0.7 considered a block invalid if there was a failure in grabbing a lock (this occurred if a block required too many locks)
      • Version 0.8 changed the database model and switched to LevelDB, which has no locking (unlike BDB, it is not a cross-process DB system)
    • On March 11, 2013, someone produced a block that needed more locks than version 0.7 nodes had available
      • The majority of the network running 0.7 rejected the block, but the miner majority running 0.8 were fine with it; this meant a local block failure became a network consensus failure
  • This was non-deterministic across platforms and uncovered a pre-0.8 exploit
    • An attacker could selectively fork-off nodes by sending them blocks that trigger the locking behavior
  • Miners quickly agreed to temporarily downgrade to 0.7 and revert the chain
    • The software was fixed and released as 0.8.1, which had a temporary rule simulating the old lock limit; the rule would expire in 2 months, giving everyone a chance to upgrade
  • In August 2013, a block was produced that would have triggered another incident, but the software worked fine
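
How a local database limit turns into a consensus split can be shown with a toy model. Everything here is an assumption made for illustration: the function names are hypothetical, "pages touched" is a stand-in for the number of locks a block needs, and the limits in the usage note are made-up numbers, not real BDB settings.

```python
from typing import Optional


def accepts_block(pages_touched: int, lock_limit: Optional[int]) -> bool:
    """Return True if a node applies the block to its database.

    lock_limit=None models a database without a lock table (the 0.8 case);
    an integer models a configured lock limit (the 0.7 case).
    """
    if lock_limit is not None and pages_touched > lock_limit:
        return False  # a local database failure, reported as "block invalid"
    return True


def network_view(pages_touched: int,
                 node_limits: dict[str, Optional[int]]) -> dict[str, bool]:
    """Show which nodes accept the same block under their own local limits."""
    return {name: accepts_block(pages_touched, limit)
            for name, limit in node_limits.items()}
```

Calling `network_view(25_000, {"v0.7-small": 10_000, "v0.7-large": 40_000, "v0.8": None})` yields a rejection from the first node and acceptance from the other two. The block itself is identical in all three cases; only local configuration differs, which is why the behavior was not deterministic and why an attacker could pick which nodes to fork off.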
