September 25, 2018
Code & Design
It’s been seventy years since the father of the Information Age, Claude Shannon, published the now industry-revered paper A Mathematical Theory of Communication. The then early-30s mathematician published it publicly in 1948. Its classified sibling, however, was a war-effort memorandum written at the prestigious Bell Labs, titled “A Mathematical Theory of Cryptography” (declassified & published in 1949 as “Communication Theory of Secrecy Systems”). Many of the core principles that appeared in the popular theory of communication stemmed from the secretive theory of cryptography. In fact, Shannon famously said the following regarding the intrinsic & overlapping properties of information communication theory & cryptography:
They were so close together you couldn’t separate them.
While the majority of this article will focus on what came after his “Mathematical Theory of Communication” paper, in order to understand a certain standard, it’s imperative we go a decade back in Shannon’s career, to when he was a 21-year-old graduate student at MIT. Pursuing a master’s in electrical engineering, Shannon found himself working on a room-scale differential analyzer (an early analog computer); his main task was designing new electrical relay circuits for the machine. At that point in time, however, circuit building was more craft than science: even the very best scientists & engineers considered it an “art,” owing to the manual, brute-force, trial-&-error approach that dominated the field.
Shannon, a mathematician at heart, recalled the abstract Boolean algebra he had learned during his undergraduate studies at the University of Michigan. Boolean algebra, as you probably guessed, is a branch of math that deals with true & false statements (or 0s & 1s). While fascinating, it had few widespread applications in the mid-30s; electric circuit design, on the other hand, a modern scientific breakthrough, desperately needed a disciplined framework for further understanding.
In 1938, Shannon published his master’s thesis, A Symbolic Analysis of Relay & Switching Circuits. This genius thesis proved that, using Boolean algebra, one could conceptually automate the arrangement of relays in the then-manual telephone exchanges. By extension, this meant that by treating the binary properties of electric switches as logic functions, one could use Boolean algebra to represent & simplify any circuit design.
This basic framework of circuit building currently underlies all modern digital computer hardware.
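To make the idea concrete, here’s a minimal Python sketch (an illustration of the principle, not anything taken from Shannon’s thesis): switches wired in series behave like a Boolean AND, switches wired in parallel behave like a Boolean OR, so a relay arrangement can be written as a Boolean expression & simplified on paper before a single wire is soldered. The particular circuit below is made up for the example.

```python
from itertools import product

# Switches in series only conduct when both are closed (Boolean AND);
# switches in parallel conduct when either one is closed (Boolean OR).
def series(a, b):
    return a and b

def parallel(a, b):
    return a or b

# A made-up relay arrangement: switch x in series with (x in parallel with y).
# The absorption law of Boolean algebra says x AND (x OR y) = x,
# so the parallel pair of switches is redundant.
def circuit(x, y):
    return series(x, parallel(x, y))

def simplified(x, _y):
    return x

# Verify the simplification by checking every possible switch state.
for x, y in product([False, True], repeat=2):
    assert circuit(x, y) == simplified(x, y)

print("x AND (x OR y) behaves identically to x alone")
```

That kind of simplification, done systematically with algebra rather than by trial & error, is exactly what Shannon showed circuit designers could do.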
But again, this master’s thesis was far from his most ground-breaking scientific contribution. It was Shannon who first deeply explored the relationship between binary values (0s & 1s), communication, computation &, by no exaggerated extension, cryptography. In fact, a decade after that master’s thesis, while crafting his pièce de résistance theory of communication & cryptography deep within Bell Labs, he finally put a name to what he believed was the basic unit of all information: the binary digit, or bit.
And so, sometime during the years that Shannon’s brilliance spanned scientific information communication & war-time cryptography (1944–1949), the bit became the standard unit of information for all computing. Computers strictly understand 0s & 1s, so the question follows: how do we go from binary code to, say, the very same alphanumeric characters you’re reading on this screen?
A single bit is only ever a zero or a one; it has just two possible states: [0, 1]. With two bits we get a total of four possibilities: [00, 01, 10, 11].
Following this pattern, it becomes fairly obvious that for every n bits we have 2ⁿ possible states.
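If you’d rather see the pattern than take it on faith, this quick Python sketch enumerates every possible bit string of length n & confirms the count is always 2ⁿ:

```python
from itertools import product

# Enumerate every bit string of length n & confirm there are 2^n of them.
for n in range(1, 5):
    states = [''.join(bits) for bits in product('01', repeat=n)]
    assert len(states) == 2 ** n
    print(f"{n} bit(s): {len(states)} states (2^{n})")

# 1 bit(s): 2 states (2^1)
# 2 bit(s): 4 states (2^2)  ->  ['00', '01', '10', '11']
# ...
```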
Eventually, the need for more symbols & letters, in order to make working with computers more developer-friendly, came to the forefront of computer scientists’ attention: how does one build a number system, let alone an entire alphabet, from nothing but 0s & 1s?
If you’ve ever had to customize a color online, you’ve probably come across a hexadecimal string at one point or another; it usually looks something like this: #012f5b
Designers are very familiar with this numbering system because it’s the standard way to notate colors digitally. The core rule of the hexadecimal numbering system is that every character is strictly one of the following sixteen values: 0–9 & A–F. The first ten integers (counting zero) plus the first six letters of the English alphabet make up the entirety of the hexadecimal numbering system. That’s a total of sixteen (16) possible states; another way of writing 16 is 2⁴. How could we represent these possible states?
With a total of four bits: 4 bits = 2⁴ possible states
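In other words, each of the sixteen hexadecimal characters is shorthand for exactly four bits. Here’s a quick Python sketch that makes the mapping explicit; the color string it decomposes is the same #012f5b from above:

```python
# Every hexadecimal character is shorthand for exactly one 4-bit pattern.
for digit in '0123456789abcdef':
    print(digit, '->', format(int(digit, 16), '04b'))
# 0 -> 0000, 1 -> 0001, ..., e -> 1110, f -> 1111

# The color string from above, #012f5b, is therefore just 6 x 4 = 24 bits.
color = '012f5b'
bits = ''.join(format(int(d, 16), '04b') for d in color)
print(bits)        # 000000010010111101011011
print(len(bits))   # 24
```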
Single-digit integers & the first six letters of the English alphabet are certainly a step towards a friendlier computer language, but are they enough? How would we, for example, denote a space? Differentiate between lowercase & uppercase? Use punctuation like an exclamation point or a question mark? No, sixteen characters wouldn’t do.
So in the early 60s, one Bob Bemer of IBM proposed a single coding standard to the national standards body now known as the American National Standards Institute (ANSI). Years of iteration later, President Lyndon B. Johnson signed a memorandum adopting the result, ASCII (the American Standard Code for Information Interchange), as the standard communication language for all federal computers. And since the internet was still in its infancy during that period, ASCII became ubiquitous as the internet spread.
The original version of ASCII specified a seven-bit system; shortly afterwards, however, it became standard to use an extended (or derivative) version of ASCII that called for eight bits. This meant that any human-readable character output by a computer could be represented by eight bits, which translates to 2⁸ = 256 possible states! This eight-bit to alphanumeric character mapping is best summarized by the standard ASCII table.
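Rather than reproduce the full table here, this short Python sketch prints a few representative rows: the character, its decimal code, its eight-bit binary form & its hexadecimal form (the handful of characters chosen is arbitrary):

```python
# A small slice of the ASCII mapping:
# character -> decimal code -> eight-bit binary -> hexadecimal.
for ch in [' ', '!', '?', '0', '9', 'A', 'Z', 'a', 'z']:
    code = ord(ch)
    print(f"{ch!r:>4}  {code:>3}  {code:08b}  0x{code:02x}")

# ' '    32  00100000  0x20
# 'A'    65  01000001  0x41
# 'a'    97  01100001  0x61
```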
We’ve now covered the birth of the bit & the pragmatism of computing with it. From there we explained how four bits (2⁴) give us the hexadecimal system & how eight bits (2⁸) give us the still-in-use extended ASCII character set. We’re now going to introduce a final principle that’ll hopefully make it clear why understanding the fundamentals of bits is crucial to a thorough understanding of cryptography &, by extension, cryptocurrencies.
Eight bits (2⁸) is actually a super important number, not just in cryptography & cryptocurrencies but in all computing. In fact, eight bits are so standard that an eight-bit string was given its own name: a byte. A byte is a string of eight bits: 8 bits = 1 byte.
The fact that a byte can represent a single character is a key reason why multiples of eight are extremely common numbers in cryptography, such as 128 & 256 (as in SHA256, the famous hashing algorithm behind Bitcoin’s consensus). Intuitively understanding how to go from bits, to hexadecimal values, to alphanumeric characters, to bytes will be core knowledge for really understanding the driving forces behind cryptocurrencies. If you’re feeling overwhelmed, don’t worry; that’s perfectly natural when broaching such complex topics.
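To tie the whole chain together, here’s a short Python sketch using the standard-library hashlib module. It hashes a placeholder message with SHA-256 & shows that the resulting digest is 32 bytes, i.e. 256 bits, conventionally displayed as 64 hexadecimal characters of four bits each:

```python
import hashlib

# Hash a placeholder message with SHA-256 (the same hash family Bitcoin
# relies on for its proof-of-work mining).
digest = hashlib.sha256(b"hello, world").digest()

print(len(digest))         # 32 bytes
print(len(digest) * 8)     # 256 bits
print(digest.hex())        # 64 hexadecimal characters (64 x 4 bits = 256 bits)
print(len(digest.hex()))   # 64
```

Bits, hexadecimal characters & bytes are simply three representations of the same underlying information.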
References
Soni, Jimmy & Goodman, Rob. A Mind at Play: How Claude Shannon Invented the Information Age.