What Is a Hash?
Hash
Definition
A hash is the output of a hash function. A hash function is an algorithm that takes a piece of data and converts it into another piece of data. Hash functions are widely used in cryptography and cryptocurrencies.
A hash function is a mathematical procedure that converts data into another form. To use a hash function, you take some original data (a string of numbers or letters, for example) and perform a set of mathematical operations on it. The process will produce another set of data, and this is often referred to as the “hash” of the original data.
Hashes have some special qualities that make them very useful in cryptography. Hashes produced by a particular hash function are always the same length, regardless of the size of the original data. In addition, you can’t “reverse-engineer” a hash: with a well-designed cryptographic hash function, it is not practically possible to get back to the original data if you only know the hash.
How a hash function works
Hash functions take inputs and produce outputs. An input can be any type of data: a string of numbers, for example, or a piece of text. Since computers encode all data as a string of binary numbers (1s and 0s), any data or file that is used on a computer system can form the input for a hash function.
A hash function takes this input and performs a series of mathematical operations on it. These operations are together known as an “algorithm”: a set of instructions that tells a computer how to carry out the hash function.
Applying a hash function to a particular piece of data will produce another set of data. This is known as the output of the hash function, and is often just called a “hash” of the original data.
For example: a string of numbers such as “123456” can be an input for a hash function. A hash function will take this original string of numbers, perform a series of mathematical operations on it, and provide you with an output. The output might look completely different from the input — it might be “$H39n2” — but it is related to the input in a number of special ways that make it very useful in cryptography.
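The example above can be tried directly. The following is a minimal sketch in Python, assuming SHA-256 from the standard `hashlib` module as the hash function; the choice of SHA-256 is an assumption made for illustration, and the real output is a 64-character hexadecimal string rather than the short made-up value shown above.

```python
import hashlib

# The input from the example above, encoded to bytes because hash
# functions operate on raw bytes rather than on text.
data = "123456".encode("utf-8")

# SHA-256 is used here only for illustration; other hash functions
# would produce a different output for the same input.
digest = hashlib.sha256(data).hexdigest()

print(digest)       # a hexadecimal string that looks nothing like "123456"
print(len(digest))  # 64
```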
Consistent size
The output of a hash function is always the same size, no matter what the size of the original data. This feature makes hash functions convenient to work with for computer programmers.
For example, if you took two strings of letters of different lengths – “hello” and “I’m a hash function” – and put them both through a hash function, the outputs for each would be different, but they would be the same length.
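A short sketch of this comparison, again assuming SHA-256 from Python's `hashlib` as the hash function (an illustrative choice, not one named by the text):

```python
import hashlib

short_input = "hello"
long_input = "I'm a hash function"

short_hash = hashlib.sha256(short_input.encode("utf-8")).hexdigest()
long_hash = hashlib.sha256(long_input.encode("utf-8")).hexdigest()

# The inputs have different lengths, but both hashes are 64 hex characters.
print(len(short_hash), len(long_hash))  # 64 64

# The hashes themselves are different.
print(short_hash != long_hash)          # True
```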
Comparability
A second important feature of hash functions is that the same input will always produce the same output. Though the output of a hash function will look nothing like the original data, if you apply a hash function to two identical files, the outputs will also be identical.
This feature allows the two original files to be compared, even if you don’t know what they contain.
For example: you might want to check that a database you hold is the same as a database held by another person, but without sharing the actual information it contains. If you both perform a hash function on your copies of the database, and the outputs are the same, you can be assured that the original files are identical.
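The database comparison could look something like the sketch below. It assumes SHA-256 and uses small in-memory byte streams as stand-ins for the two database files; the helper name `digest_of` and the sample contents are hypothetical.

```python
import hashlib
import io

def digest_of(stream, chunk_size=65536):
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    hasher = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
    return hasher.hexdigest()

# In-memory stand-ins for two people's copies of a database file.
my_copy = io.BytesIO(b"alice,100\nbob,250\n")
their_copy = io.BytesIO(b"alice,100\nbob,250\n")
altered_copy = io.BytesIO(b"alice,100\nbob,251\n")

my_digest = digest_of(my_copy)
their_digest = digest_of(their_copy)
altered_digest = digest_of(altered_copy)

# Only the digests need to be exchanged, not the contents.
print(my_digest == their_digest)    # True: identical contents
print(my_digest == altered_digest)  # False: one character differs
```

Reading in chunks means the same helper would work on a large file opened with `open(path, "rb")` without loading it all into memory.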
This comparison rests on an assumption that has rare exceptions: a hash function can occasionally produce the same output from two different inputs. This is known as a “collision.” High-quality hash functions are designed to make collisions extremely unlikely, but because there are more possible inputs than possible outputs, no hash function can avoid them completely.
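Collisions are easy to see with a deliberately weak hash. The sketch below defines a toy function, `tiny_hash` (a hypothetical name), that keeps only the first byte of a SHA-256 digest. With only 256 possible outputs, any 257 distinct inputs must include at least one collision. Real hash functions have vastly more possible outputs, which is why collisions in them are so hard to find.

```python
import hashlib

def tiny_hash(text):
    """Toy hash: only the first byte of SHA-256, so just 256 possible outputs."""
    return hashlib.sha256(text.encode("utf-8")).digest()[0]

def find_collision(limit=257):
    """Return two different inputs with the same tiny_hash, or None."""
    seen = {}  # maps each tiny hash value to the first input that produced it
    for i in range(limit):
        candidate = str(i)
        h = tiny_hash(candidate)
        if h in seen:
            return seen[h], candidate
        seen[h] = candidate
    return None

first, second = find_collision()

print(first != second)                        # True: the inputs differ
print(tiny_hash(first) == tiny_hash(second))  # True: the toy hashes match
```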
One-way
The third important feature of hash functions is that they only work one way: you can’t practically recreate the original data from the output, even if you know which hash function was used to produce it.
A good analogy for this is baking a cake. You can’t get the eggs, flour, and sugar back out of a cake after they have been combined. In a similar way, you can’t run a hash function “backward” and recreate the input from the output.
This feature makes hash functions very useful in cryptography, because it means that data can be hidden by applying a hash function to it. As a result, hash functions are used extensively in online security, from protecting passwords to detecting data breaches to checking the integrity of a downloaded file.
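Because a hash function cannot be run backward, the only general way to find an input from its hash is to try candidate inputs and compare the results. The sketch below illustrates this, assuming SHA-256; the helper names and the list of guesses are hypothetical.

```python
import hashlib

def sha256_hex(text):
    """Return the SHA-256 hex digest of a text string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def guess_input(target_hash, candidates):
    """Return the candidate whose hash matches target_hash, or None."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None

# Suppose only this hash is known, not the input that produced it.
target = sha256_hex("123456")

print(guess_input(target, ["password", "letmein", "123456"]))  # 123456
print(guess_input(target, ["password", "letmein"]))            # None
```

Guessing only succeeds when the original input happens to be among the candidates, which is why short or common inputs are easier to uncover than long, unpredictable ones.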
Hashing in cryptocurrencies
The most common place you’ll encounter hash functions — unless you work in cybersecurity — is in relation to cryptocurrencies.
Cryptocurrencies rely on a type of computer system known as a “blockchain.” All of the transactions made with a particular cryptocurrency — that is, every time someone sells or buys a cryptocurrency “coin” — are stored in a database known as a “ledger.”
What differentiates this ledger from the more traditional type held by a bank is that it is distributed. Rather than one central authoritative copy of the ledger, every computer connected to the network holds its own copy.
Because of this, it’s vitally important that every copy of the ledger is identical. If they aren’t, this could open the blockchain to abuse, such as fraudulent transactions or double-spending of the currency. At the same time, however, it’s important that the transactions made with the currency are anonymous, so no one can share the actual contents of their copy of the ledger.
Hashing provides a way around this issue. Each user of a cryptocurrency can produce a hash of their copy of the ledger, and these can be compared. If they are the same, then the original contents of their ledger must also be identical. This allows the integrity of the blockchain to be checked without breaking the anonymity of its participants.
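As a simplified illustration of this idea, the sketch below hashes two copies of a small ledger and compares the results. It assumes SHA-256 and a JSON encoding of the transactions; the function name `ledger_digest`, the field names, and the addresses are all hypothetical, and real blockchains use their own data formats and structures.

```python
import hashlib
import json

def ledger_digest(transactions):
    """Hash a list of transactions after serializing it deterministically."""
    # sort_keys and fixed separators make equal ledgers serialize to equal bytes.
    encoded = json.dumps(transactions, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()

# Two copies holding the same transactions (key order differs, content does not).
node_a = [{"from": "addr1", "to": "addr2", "amount": 5},
          {"from": "addr2", "to": "addr3", "amount": 2}]
node_b = [{"to": "addr2", "from": "addr1", "amount": 5},
          {"amount": 2, "from": "addr2", "to": "addr3"}]

# A copy in which one amount has been altered.
tampered = [{"from": "addr1", "to": "addr2", "amount": 50},
            {"from": "addr2", "to": "addr3", "amount": 2}]

print(ledger_digest(node_a) == ledger_digest(node_b))    # True
print(ledger_digest(node_a) == ledger_digest(tampered))  # False
```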
Examples of Hashing
Hashes are used for much more than just cryptocurrencies. In fact, they are used very widely in contemporary computer systems.
One example of this is keeping passwords secure. When you make a new password for an online account, the password you enter isn’t actually sent to the company that provides the account. Instead, the password system generates a hash of your password, and sends the hash to the company’s computers for storage.
When you log in again, the system generates a hash of the password you enter and sends it to the company. Their system compares this hash with the one generated when you made the password. If they match, then they can be sure that you’ve entered the same password.
Without a hash function, it would be necessary to send your actual password over the internet, and this represents a security risk. Anyone “listening in” on your connection would be able to see your password. By using a hash function, all they will be able to see is the hash you’ve generated, and it is not practically possible to re-create your actual password from this.
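The generate-then-compare flow described above can be sketched as follows. Note one substitution: rather than a single plain hash, this sketch uses PBKDF2 (`hashlib.pbkdf2_hmac` in Python's standard library) with a random salt, since password systems commonly use a salted, deliberately slow function of this kind. The function names and the iteration count are illustrative assumptions, not a recommendation.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value only

def hash_password(password, salt=None):
    """Return (salt, derived_key) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for a new password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key

def verify_password(password, salt, stored_key):
    """Re-derive the key from the entered password and compare it to the stored one."""
    _, key = hash_password(password, salt)
    # compare_digest compares in constant time to avoid leaking timing information.
    return hmac.compare_digest(key, stored_key)

# Creating the password: only the salt and the derived key are stored.
salt, stored = hash_password("correct horse")

# Logging in again: the entered password is hashed and compared.
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```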
Hash function key takeaways
Hash functions are algorithms that take a piece of data — a set of numbers or letters, for example — and convert it into another piece of data known as a “hash.”
This hash has some special properties that make it very useful in cryptography: a hash is always the same size, no matter the size of the original data; the same input will always produce the same output; and it is not practically possible to recover the original data from the output.
Because of these special features, hash functions are used widely in computer systems, including in cryptocurrencies.