Secure Hash Algorithms

Hash functions

Hash functions use compression algorithms to create fixed-length digests for any arbitrarily long strings. The digests are also called ‘fingerprints’ or ‘signatures’ [@wagnerWhatSHA2562020]. The outputs are unique (1-2-1 relationship) for every different input text (collision resistance) and are mathematically irreversible, meaning that one should never be able to find an input that hashes to the same digest of another known input (second pre-image resistance). In addition, it is impossible to guess the original text from a given digest (pre-image resistance). If you change a single letter in the entire ‘Lord of the Rings series, the digest for the books will be drastically different. This is a property of good hash functions - called the avalanche effect.

Hash functions are often used for digital signatures, MACs and data integrity services, such as P2P file sharing and verification, virus fingerprinting, Merkle trees and distributed hash tables [@bashirMasteringBlockchainInner2023]. They are essential in constructing other cryptographic primitives. Another well-known use case for hash functions is to store passwords (one-way hash). However, the requirements are very different for the two use case types. The former requires the algorithm to be fast so large amounts of text can be processed instantaneously. The latter needs the algorithms to be slow enough to deter brutal password-guessing attacks.

SHA-2

As of 2023, SHA-2 is the internet’s most commonly used hash function, providing four different versions that produce 224 (SHA-224), 256 (SHA-256), 384 (SHA-384) and 512 (SHA-512) byte outputs. Older generations of hash functions, such as MD (MD4, MD5) and SHA1, have been badly broken and should not be used anymore.

Other popular hash algorithms include RIPEMD and Whirlpool.

How Does SHA-2 Work?

First, the input message is divided into equal-sized blocks. The blocks are then compressed in multiple rounds on a block-by-block basis to produce a compressed output. This mechanism is known as the Merkle-Damgard Construction. For example, SHA-256 has an input message size limit of $2^{64} - 1$ . with a block size of 512 bits, a word limit of 32 bits and an output of 256 bits.

SHA-2 uses the Davies-Meyer construction as its compression function.

The algorithm works as follows:

Padding the message to multiples of 512 bits if it is shorter than the required block size or cannot be divided equally
Divide the message into blocks of 512 bits
Setting up the initial hash value. It consists of eight 32-bit words obtained by taking the first 32 bits of the fractional parts of the square roots of the first eight prime numbers.
Iterately apply the compression function to the message blocks, using the output from the previous compression function call as the second argument to the compression function for the next block

You can use openssl command line tool to see SHA-2 in action:

echo -n "hello" | openssl dgst --sha256
 
openssl dgst --sha256 ./input.txt

SHA-3

SHA-3 (a revised version of Keccak) is the latest standard in the SHA family. SHA 3 uses a fundamentally different structure (Sponge construction) from SHA-2. Unlike SHA-2, it is not susceptible to length-extension attacks and is suitable for hashing secrets. It is the current recommended Hash Function from NIST.

References

Bashir, I. (2023) Mastering Blockchain: Inner workings of blockchain, from cryptography and decentralized identities, to DeFi, NFTs and Web3. Packt Publishing Ltd.
Wagner, L. (2020) What is SHA-256? https://blog.boot.dev/cryptography/how-sha-2-works-step-by-step-sha-256/.

Liwen's Notes

Explorer

Recent Notes

How to Read Effectively

LangChain

Mixture of Expert (MoE)

The `NextRequest` and `NextResponse` Objects

LLM Explainability & Interpretability

Accountability & Motivation

Reasoning Models

Gen AI: The Future Is Agentic

Secure Hash Algorithms

Hash functions

SHA-2

How Does SHA-2 Work?

SHA-3

References

Graph View

Table of Contents

Backlinks