How Blockchains Work

One of the biggest innovations in computing over the last several years has been the blockchain.

There have been a host of companies that have hyped products using a blockchain. Blockchains are the basis of all cryptocurrencies and non-fungible tokens.

Despite all the talk about blockchains, most people still aren’t totally sure what a blockchain is or how it works.

So, learn more about what blockchains are and how they work on this episode of Everything Everywhere Daily.

If you’ve paid attention to technology or computer news in the last few years, you’ve probably heard a lot about blockchains. Blockchains are the technology behind cryptocurrencies and non-fungible tokens or NFTs.

However, I’ve noticed that while there have been many mentions of blockchains, there hasn’t been a lot in the way of explaining what a blockchain actually does.

My goal in this episode is to provide a high-level understanding of what a blockchain is and how it works. I’m not going to get into the specifics of any particular use of blockchains. Also, I know that I have a wide range of people who listen to this show, so there will be some who are very familiar with what I’m talking about, and there are some who might not even know what a blockchain is.

This episode is intended for the latter group, who might have heard the term but otherwise don’t really know what it is.

With that, let’s start with what problem a blockchain is trying to solve.

Let’s say that I am going to allow people to vote on if I should do an episode on tomatoes. So, I create a Microsoft Word document and make it available for download.

A bunch of people download that Microsoft Word file and then vote in the Word file.

I’ve distributed the file, but there is an obvious problem. No one, including me, can see what changes you made. The file is distributed, but it isn’t connected. If everyone sent me their copy of the Word document, no one else could see what the votes were. Some of the votes might get stuck in my spam folder. Even if I then reupload the results, there is no guarantee that what I post is correct.

So, let’s solve that problem by creating a document in Google Docs that anyone can access online. Now, everyone can see and edit the same document, but there is now another problem.

Let’s say that someone out there really doesn’t want me to do an episode on tomatoes, so when people start adding their names to the pro-tomato list, that person starts deleting them.

This document is public and distributed, but it is inherently untrustworthy.

We could solve this problem by making me some sort of admin, where I can see all the edits that everyone makes.

So, again, someone votes on the tomato episode, but I secretly don’t want to do a tomato episode, so I edit the document to delete pro-tomato votes. The entire system has now been condensed into trust in a single person, me, who has a vested interest in the outcome.

So, to solve that problem, instead of me being the admin, we let some third-party admin at Google handle everything. However, the problem is still there. We now have to trust that person, and we don’t know if they are pro or anti-tomato.

Even if we removed all the admins and created some sort of automated system, we still have to deal with the issue of hackers hacking into a database to change the results.

This is a seemingly intractable problem. An episode about tomatoes is a trivial example, but you can see how this might be a huge problem if we are talking about money or other important data.

This problem of achieving consensus among a group of distributed and possibly faulty actors in the presence of unreliable communication channels is known as the Byzantine Generals Problem.

For most things, a centralized database is fine, but any centralized system is also a centralized point of failure. Almost weekly, you will hear a news story about some company that had their data stolen by hackers.

A blockchain can solve many of these problems.

The ideas behind blockchains were first proposed in 1982 by the cryptographer David Chaum in his dissertation titled “Computer Systems Established, Maintained, and Trusted by Mutually Suspicious Groups.”

Further work was done in the early 90s by cryptographers with attempts to create time stamps that couldn’t be altered.

The first working blockchain system was described in 2008 by an unknown person who is known only by the pseudonym Satoshi Nakamoto. That system went live in January 2009 as the cryptocurrency known as Bitcoin.

However, soon after the release of Bitcoin, people realized that the blockchain technology that was at the foundation of it could be used for a wide variety of things.

The key to understanding blockchains has to do with something called a hash function. I’m not going to get into the weeds on the mathematics of it, but a hash function is a type of function known as a one-way function, sometimes loosely called a trap door function.

They are very easy to calculate but very difficult to reverse.

In the case of a hash function, you can put anything into it, and what will come out is a number of a set length. One of the most popular hash functions is known as SHA-2. SHA stands for Secure Hash Algorithm, and it was created by the United States National Security Agency (NSA) in 2001.

There are different variants of it, but the most common is SHA-256. That means whatever you put into the function, it creates a binary number that is 256 digits long, which can also be written as a 64-digit hexadecimal (base-16) number.

For example, if you put in a single character, it will output a binary number of 256 bits. If you put in the complete works of Shakespeare, you still get a 256-bit number as an output.
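That fixed-length behavior is easy to check for yourself. Here is a minimal sketch using Python's built-in hashlib module; the helper name sha256_hex and the sample inputs are just illustrative choices, and a repeated line of Hamlet stands in for the complete works of Shakespeare.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# A single character as input.
short_digest = sha256_hex("a")

# A much longer input (a stand-in for a very large document).
long_digest = sha256_hex("To be, or not to be, that is the question. " * 10_000)

# Both digests are 64 hexadecimal digits, which is 256 bits.
print(len(short_digest), len(long_digest))
```

Each hexadecimal digit encodes 4 bits, so 64 of them make up the 256 bits.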

A 256-bit binary number is really big. If you were to convert it into a number in base-10 like we are used to, it could have as many as 78 digits. That is a number so large that it is greater than the number of atoms in our galaxy.
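The size of that number can be checked directly. As a small sketch, Python integers have no size limit, so you can write out the largest possible 256-bit value and count its decimal digits. It comes out to 78 digits, or roughly 1.16 times 10 to the 77th power. The variable names here are just for illustration.

```python
# The largest value that fits in 256 bits is 2^256 - 1.
largest_256_bit = 2**256 - 1

# Count how many base-10 digits it takes to write that value out.
decimal_digits = len(str(largest_256_bit))

print(decimal_digits)  # 78
```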

The thing about a hash function is that if you change even one character or digit in the input, you get a totally different output. The output doesn’t change by one digit. In theory, every digit could change.
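To see that effect, here is a rough sketch that hashes two inputs differing by a single character and counts how many of the 256 output bits are different. The inputs "tomato" and "tomatp" and the helper name sha256_bits are my own illustrative choices. For a good hash function you would expect roughly half of the bits to flip.

```python
import hashlib


def sha256_bits(text: str) -> str:
    """Return the SHA-256 digest of text as a string of 256 ones and zeroes."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return "".join(f"{byte:08b}" for byte in digest)


bits_a = sha256_bits("tomato")
bits_b = sha256_bits("tomatp")  # only the last character is different

# Compare the two outputs position by position.
differing = sum(x != y for x, y in zip(bits_a, bits_b))

print(f"{differing} of {len(bits_a)} bits differ")
```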

Moreover, if you know the 256-bit output that is created by a hash function, it is almost impossible to know the input that created it. That is why it’s known as a trap door or one-way function. It is very easy to calculate and very hard to reverse.

In fact, the only known way to find out what the original input was on a hash function, assuming you know the output, would be to guess randomly. You’d probably win the lottery a billion times before you figure out the correct input to get the same output from a hash function.

So, what does this have to do with a blockchain?

The first part is the block.

Using the original example of people voting on tomato or not tomato, let’s assume that each vote will be a block. The first person voting will enter “tomato.” Entering tomato in the hash function will then create a very long string of numbers that is unique to tomato.

I’m now going to chain the second block to the first block, hence a blockchain.

The second block will be the output of the first block plus whatever the second vote is. This creates a new, very long number as the output for the second block.

Now, remember what I said: if you change even one number or one character in the input of a hash function, you get a totally different output.

That means if anyone were to tamper with the first vote and try to change it to “not tomato,” it would radically change the output, which itself was part of the input of the second block.

Changing the input anywhere along the chain changes everything that comes after it because everything is chained together.

You can chain hundreds, thousands, or millions of blocks together, and you will, in the end, get a unique result. A result that can only appear if every link in the chain is valid.
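The chaining idea can be sketched in a few lines of Python. This is a toy version of the tomato vote, not how any real blockchain stores its data: each block's hash is simply SHA-256 of the previous block's hash plus the new vote, and the function names are made up for this example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chain_hashes(votes):
    """Hash each vote together with the hash of the block before it."""
    hashes = []
    previous = ""  # the first block has nothing before it
    for vote in votes:
        previous = sha256_hex(previous + vote)
        hashes.append(previous)
    return hashes


honest = chain_hashes(["tomato", "tomato", "not tomato", "tomato"])
# Same votes, except the second one has been tampered with.
tampered = chain_hashes(["tomato", "not tomato", "not tomato", "tomato"])

for position, (a, b) in enumerate(zip(honest, tampered), start=1):
    print(position, "same" if a == b else "different")
```

The first block matches in both chains, but every block from the tampered vote onward comes out different, which is exactly the knock-on effect described above.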

Moreover, this verification of the chain can be done by anyone. There doesn’t have to be a single computer that does all of the computations. Because every link on the chain needs to be verified, in theory, everyone could have a copy of the blockchain and verify every step in the chain themselves using the same hash function.
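Here is a minimal sketch of what that do-it-yourself verification might look like, again using the toy tomato ledger. The block layout (a dictionary with a vote and a hash) and the function names are assumptions made for this example. The point is only that anyone holding a copy can recompute every hash and compare it with what was recorded.

```python
import hashlib


def block_hash(previous_hash: str, vote: str) -> str:
    """Hash a vote together with the hash of the previous block."""
    return hashlib.sha256((previous_hash + vote).encode("utf-8")).hexdigest()


def build_chain(votes):
    """Build a toy ledger: a list of blocks, each with a vote and its hash."""
    chain, previous = [], ""
    for vote in votes:
        previous = block_hash(previous, vote)
        chain.append({"vote": vote, "hash": previous})
    return chain


def verify_chain(chain) -> bool:
    """Recompute every hash from scratch and compare it with the stored one."""
    previous = ""
    for block in chain:
        if block_hash(previous, block["vote"]) != block["hash"]:
            return False
        previous = block["hash"]
    return True


ledger = build_chain(["tomato", "tomato", "not tomato"])
valid_before = verify_chain(ledger)

# Someone quietly edits the first vote without redoing the hashes.
ledger[0]["vote"] = "not tomato"
valid_after = verify_chain(ledger)

print(valid_before, valid_after)  # True False
```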

Because everyone using the same hash function can get the same result, and everyone can have a copy, this is known as a distributed public ledger. It is public, transparent, distributed, tamper-resistant, and it requires no trust in a single individual.

The example I gave is a trivial one, but it illustrates the point.

A blockchain is a chain of blocks. Each block is tied to the one before it by the output of a hash function, which makes it nearly impossible to tamper with one block without ruining the entire chain.

Real-life blockchains are more complicated than this. For starters, each block isn’t a single entry. It might be a collection of entries.

One issue with real blockchains, like with Bitcoin, is the creation of new blocks. Because multiple transactions are in a single block, creating a new block isn’t done willy-nilly.

Creating a new block is difficult. It is so difficult that the process is called mining. To create a new block, you have to find a value that, when run through the hash function along with the block’s contents, produces an output that starts with a string of zeroes. How many zeroes will depend on the difficulty setting.

Because there is no way to predict what number will come out of a hash function, the only way to find such a number is just by testing random numbers at a massive scale. This is called hashing.
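A toy version of that search fits in a few lines. This sketch is a simplification and not Bitcoin's actual procedure: it hashes once with SHA-256, counts leading zeroes in the hexadecimal output, and tries numbers in order rather than at random. The block contents and the difficulty of 4 are made-up values so that it finishes in well under a second.

```python
import hashlib


def mine(block_data: str, difficulty: int):
    """Try numbers one after another until the hash starts with enough zeroes."""
    prefix = "0" * difficulty
    nonce = 0  # the value being guessed
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Hypothetical block contents for the tomato vote.
nonce, digest = mine("tomato,tomato,not tomato", difficulty=4)

print(nonce, digest)
```

Each extra zero required in the hexadecimal output makes the search about 16 times longer on average, which is how a difficulty setting can keep block creation slow no matter how many computers are guessing.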

As of the time of this recording, there are about 424 million terahashes conducted every second. That’s 424 million trillion hashes every second, trying to find a number that could start a new block. Even with all of these mathematical guesses, one new block is created about every 10 minutes.
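To put those two figures together, here is a quick back-of-the-envelope calculation. It simply multiplies the hash rate quoted above by ten minutes' worth of seconds, so it is an estimate of the guesses made per block rather than an official statistic.

```python
# One terahash is a trillion (10^12) hashes.
hashes_per_second = 424_000_000 * 10**12

# A new block appears roughly every 10 minutes.
seconds_per_block = 10 * 60

hashes_per_block = hashes_per_second * seconds_per_block

print(f"{hashes_per_block:.3e}")  # 2.544e+23
```

That works out to roughly 250 billion trillion guesses for every block that gets created.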

This difficulty in creating new blocks is known as proof of work.

So, what is the point of all of this?

The biggest use today is for cryptocurrencies. The blockchain literally tracks every single transaction that has ever occurred. A few years ago, in an effort to learn more about all this Bitcoin stuff I was hearing about, I started my own Bitcoin node on a tiny Raspberry Pi computer. When I first turned it on, it took a week to verify every single transaction in history, and the entire blockchain now takes up about 500 gigabytes of storage.

However, there are other uses. There is a service called PodPing that uses a blockchain to inform services when podcasts have been updated. It cuts down on the tremendous bandwidth costs that are incurred, constantly checking to see if a podcast has been updated.

Rather than checking the RSS feed for every single podcast over and over, you just need to follow the updates from a single blockchain to know when a show has been updated.

Blockchains allow non-fungible tokens to be assigned to a single user. Many current NFTs are just collectibles, but the same approach works for anything digital that you don’t want to be infinitely copied.

Blockchains have been proposed for other types of data that need to be secured and potentially available anywhere. Real estate deeds are an example of something that could be stored in a blockchain to ensure records aren’t lost and are accessible by anyone. It could also potentially remove many of the intermediaries required for a real estate transaction.

That said, there are many things that don’t make sense to put on a blockchain. I’ve heard many proposals for things that are really just large databases. They would require a large data center and would never work on a distributed blockchain.

What I’ve covered in this episode is a very cursory overview of blockchains and how they work.

If my explanation didn’t sufficiently explain how blockchains work, I suggest visiting blockchaindemo.org. There, they have a sample tool you can mess around with to see how the hash functions work when you input data and how different blocks can chain together.

Blockchains do have a lot of potential uses. They solve one of the trickiest problems with electronic communications: how you can verify data in a decentralized manner. But at the same time, a blockchain isn’t a panacea, and it isn’t designed to be used with everything.

The Executive Producer of Everything Everywhere Daily is Charles Daniel.

The associate producers are Peter Bennett and Cameron Kieffer.

Today’s review comes from listener Alesia from Denver over on Apple Podcasts in the United States. They write:

Great way to learn about a variety of topics.

Gary’s podcasts cover a wide variety of topics, from historic events to geographic facts to cultural heritage. In 15 minutes, you can get educated on a topic in-depth and in detail, and Gary’s voice is very pleasant. Love it!

Thanks, Alesia! I appreciate your kind words and I am always glad to hear from listeners up in the Mile High City. Denver is one of my favorite cities and I look forward to returning.

Remember, if you leave a review or send me a boostagram, you too can have it read on the show.