Sunday, 9 September 2018

All about "sharding" Bitcoin

There has been a lot of talk lately about "sharding" bitcoin. I put the term in quotation marks because the word "sharding" is badly overloaded. I hate that the word is used here, but what can you do? You can't force people to stop using a term you don't like...

Anyway, I have professional experience designing computer systems that handle more load by splitting their work up and spreading it across many machines. This post contains my thoughts on how the same can be done for BCH.

All computer programs have three basic phases:

  1. Read
  2. Process
  3. Write

These three phases roughly correspond to the Model, View, Controller paradigm that is popular with computer science people: the Controller is Read, the Model is Process, and the View is Write (output).

The BCH block validation process has roughly the same three phases:

  1. Block is checked for UTXO validity (Read)
  2. Each digital signature is verified to be correct (Process)
  3. The newly validated block is applied to the existing UTXO database (Write)
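The three phases above can be sketched in a few lines of Python. This is a hypothetical illustration, not real node code: the block and UTXO structures are simplified stand-ins, and the signature check is stubbed out.

```python
# Hypothetical sketch of the three-phase block validation flow.
# utxo_db is a plain dict standing in for the on-disk UTXO database.

def validate_and_apply(block, utxo_db):
    # Phase 1 (Read): every input must reference an existing UTXO.
    for tx in block["txs"]:
        for utxo_id in tx["inputs"]:
            if utxo_id not in utxo_db:
                return False

    # Phase 2 (Process): verify each signature (stubbed out here;
    # a real node runs actual signature/script verification).
    for tx in block["txs"]:
        if not all(sig == "valid" for sig in tx["sigs"]):
            return False

    # Phase 3 (Write): apply the validated block to the UTXO set.
    for tx in block["txs"]:
        for utxo_id in tx["inputs"]:
            del utxo_db[utxo_id]
        for utxo_id in tx["outputs"]:
            utxo_db[utxo_id] = True
    return True
```

Note that phase 3 only runs once phases 1 and 2 succeed, which is why, as argued below, its latency matters far less than the other two.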

To "shard" this process, you need to individually shard each phase.

To "shard" phase 1

You can only do this by installing new hardware. If the UTXO database is stored on a hard drive, the speed at which the database can be read is determined by the read speed of that drive. No amount of coding can make a hard drive read 500 MB/sec if its maximum read rate is 200 MB/sec. Even if you have multiple threads going, if they are all reading from the same drive, the extra threads won't make it any faster. On the other hand, if you replicate the database across two devices, with one copy of the UTXO database on drive A and another copy on drive B, then you can get a benefit from multiple threads: thread A handles the first half of the UTXOs using the copy on drive A, while thread B handles the second half using the copy on drive B. This setup doubles read throughput.

Imagine a guy sitting at a table with a huge book that lists every UTXO in the database. This man is handed a piece of paper listing the UTXOs spent in the latest block. His job is to make sure every UTXO on the paper corresponds to an entry in the UTXO book. This job requires only reading skills; there is no computation involved. If his math skills are terrible, it doesn't matter, because the job is pure lookup.

One man can only read so fast, so to speed things up you can hire another person. The second person needs his own book, because if the two have to share one book they slow each other down, which makes hiring the second guy pointless. If each guy gets his own book, they can work in parallel and throughput doubles.

Now the question is: do we give each guy his own full book, or do we take one book and split it in half, so guy A gets the first half and guy B gets the second half? The splitting technique is what computer science people usually call "sharding". Giving each guy his own full copy of the book is usually called "replication". Which is best for block validation? My opinion is that replication should be tried first, and only when the UTXO set gets too large to fit on a single device should "sharding" (splitting the book instead of copying it) be introduced. (Now you know why I hate the word "sharding" being used to describe making bitcoin's block validation process parallelizable.)

I don't think the UTXO database will ever get so big that it can't fit on a single device, so read "sharding" should never need to be implemented. Replication is much simpler to implement, too.
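The replication approach can be sketched as follows. This is a minimal illustration under simplifying assumptions: each Python set stands in for a full UTXO copy on its own drive, and the spent-UTXO list is split so each thread reads only its own copy.

```python
# Replicated phase-1 reads: each worker thread checks a disjoint slice
# of the block's spent-UTXO list against its own full copy of the UTXO
# set (standing in for a copy on its own physical drive).
from concurrent.futures import ThreadPoolExecutor

def check_utxos(spent_slice, utxo_copy):
    """Phase-1 read check: every spent UTXO must exist in this copy."""
    return all(u in utxo_copy for u in spent_slice)

def parallel_read_check(spent_utxos, copies):
    n = len(copies)
    chunk = (len(spent_utxos) + n - 1) // n  # split the work, not the data
    slices = (spent_utxos[i * chunk:(i + 1) * chunk] for i in range(n))
    with ThreadPoolExecutor(max_workers=n) as pool:
        return all(pool.map(check_utxos, slices, copies))
```

With true "sharding" the code would instead route each UTXO to the one copy holding its half of the book; with replication, any copy can answer any lookup, which is what makes it simpler.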

To "shard" phase 2

This is the easy part to parallelize. Most machines these days have multiple cores, and moving into the future that trend is expected to continue. This is also the phase of validation that takes the most time. I don't have numbers to prove this assumption, but my intuition tells me a simple database lookup is going to be much faster than the CPU-intensive process of executing bitcoin script. Therefore, parallelizing this phase should be a higher priority than phases 1 or 3.
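A sketch of fanning the signature checks out across a worker pool. The verify function here is a placeholder, not real cryptography, and a real node would use native threads or processes so the CPU-bound work actually runs on multiple cores (pure-Python threads are limited by the interpreter's global lock).

```python
# Phase-2 sketch: verify a block's signatures across a pool of workers.
# verify_sig is a stand-in for real ECDSA/Schnorr + script verification.
from concurrent.futures import ThreadPoolExecutor

def verify_sig(sig):
    return sig == "valid"  # placeholder for a real signature check

def verify_block_sigs(sigs, workers=4):
    # Every signature is independent of the others, which is exactly
    # why this phase parallelizes so cleanly.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return all(pool.map(verify_sig, sigs))
```

The key property being exploited is independence: no signature check depends on any other, so the work splits cleanly across however many cores are available.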

to "shard" phase 3

This phase should not need to be parallelized. Even if it takes a very long time to update the UTXO database after a block has been found valid, it doesn't matter much. By that time mining has already moved on to the next block, which is the entire point of validating a block for a miner. The UTXO database is only needed when a new block first comes in. In other words, the deadline for UTXO updates is (on average) 10 minutes after the previous block was determined to be valid. The speed of writing to a database, just like the speed of reading, is determined by the underlying hardware. Writing to a fast device is fast; writing to a slow device is slow. Optimizing code won't give you much speedup; only installing faster hardware will.

If you are using multiple read databases in phase 1, then you have to replicate the phase 3 writes across all of them. But that's OK, because of the 10-minute deadline.

My advice to the people who want to make BCH validate using multiple resources: please start with phase 2. It will give the most benefit on the types of computers the BCH node software runs on. Once that's completed, the next step is to "shard" phase 1.

I imagine a button in the settings menu labeled "Add UTXO shard". When you click it, it lists all your installed devices (local hard drives, networked drives, etc.) and asks you to select the one the new UTXO shard should live on. Once you click "OK", it copies the UTXO data to the new device and uses it in a parallel thread alongside the other UTXO database, doubling throughput when validating new blocks.

Phase 3 "sharding" will probably never be needed, but it will have to be implemented at the same time as phase 1. It's super easy to implement anyway: you just make sure the code that updates the copies across multiple devices doesn't do it in sequence; the writes have to happen simultaneously, in threads.
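The simultaneous-write idea can be sketched like this. Again a minimal illustration: each set stands in for one replicated UTXO copy on its own device, and one thread per copy applies the same block's updates at the same time.

```python
# Phase-3 sketch: apply a validated block's updates to every replicated
# UTXO copy simultaneously, one thread per copy, instead of in sequence.
import threading

def apply_block(copy, spent, created):
    # Remove spent UTXOs and add newly created ones to this copy.
    for u in spent:
        copy.discard(u)
    copy.update(created)

def apply_to_all(copies, spent, created):
    threads = [
        threading.Thread(target=apply_block, args=(c, spent, created))
        for c in copies
    ]
    for t in threads:
        t.start()   # all writes proceed at the same time...
    for t in threads:
        t.join()    # ...and we wait for every copy to finish
```

Since each thread writes to a different device, the copies don't contend with each other, and the wall-clock time is that of the slowest device rather than the sum of all of them.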

submitted by /u/freework

source https://www.reddit.com/r/btc/comments/9e60ps/all_about_sharding_bitcoin/
