A hash function takes any input — a word, a file, an entire database — and produces a fixed-length string of characters called a digest. Change a single character in the input and the entire digest changes completely. Feed in the same input twice and you always get the same output. That combination of properties makes hashing one of the most useful tools in computing.
What does a hash actually look like?
Take the word hello and run it through each algorithm:
| Algorithm | Output (hex) | Length |
|---|---|---|
| MD5 | 5d41402a bc4b2a76 b9719d91 1017c592 | 32 chars |
| SHA-1 | aaf4c61d dcc5e8a2 dabede0f 3b482cd9 aea9434d | 40 chars |
| SHA-256 | 2cf24dba 5fb0a30e 26e83b2a c5b9e29e 1b161e5c 1fa7425e 73043362 938b9824 | 64 chars |
| SHA-512 | 9b71d224… (much longer) | 128 chars |
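Although hello is only 5 characters long, each algorithm returns the same number of characters it would return for a multi-gigabyte file. A fixed output length regardless of input size is a defining property of hash functions. The spaces in the table are only there for readability; a real digest is one unbroken hex string.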
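To reproduce the digests of hello yourself, a minimal sketch using Python's standard `hashlib` module is shown here. The helper name `hex_digests` is an illustrative assumption, not something from a library.

```python
import hashlib


def hex_digests(message: bytes) -> dict[str, str]:
    """Return the hex digest of `message` for each algorithm in the table."""
    algorithms = ("md5", "sha1", "sha256", "sha512")
    return {name: hashlib.new(name, message).hexdigest() for name in algorithms}


# Hash functions operate on bytes, so the text "hello" is given as b"hello".
for name, digest in hex_digests(b"hello").items():
    print(f"{name:<7} {len(digest):>3} hex chars  {digest}")
```

Each hex character encodes 4 bits, which is why a 256-bit digest prints as 64 characters.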
The key properties of a good hash function
- Deterministic — same input always produces the same hash.
- One-way — you cannot reverse a hash back to the original input.
- Avalanche effect — changing one bit in the input changes roughly half the output bits.
- Collision resistant — it should be practically impossible to find two different inputs that produce the same hash.
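The determinism and avalanche properties listed above can be observed directly. The sketch below hashes two inputs that differ in exactly one bit and counts how many of the 256 output bits change; the function name `differing_bits` is hypothetical.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where the SHA-256 digests of a and b differ."""
    digest_a = int.from_bytes(hashlib.sha256(a).digest(), "big")
    digest_b = int.from_bytes(hashlib.sha256(b).digest(), "big")
    # XOR leaves a 1 wherever the two digests disagree.
    return bin(digest_a ^ digest_b).count("1")


# Deterministic: hashing the same input twice gives the same digest.
assert hashlib.sha256(b"hello").digest() == hashlib.sha256(b"hello").digest()

# 'o' is 0x6f and 'n' is 0x6e, so these inputs differ in a single bit.
changed = differing_bits(b"hello", b"helln")
print(f"{changed} of 256 output bits changed")
```

For a well-behaved hash the count should land near 128, though the exact figure varies from one input pair to the next.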
MD5 — fast but broken
MD5 produces a 128-bit (32-character hex) digest. It was designed in 1991 and was once the standard for checksums and password hashing. Today it is considered cryptographically broken — researchers can generate deliberate collisions in seconds on consumer hardware.
SHA-1 — deprecated, avoid where possible
SHA-1 produces a 160-bit (40-character hex) digest. It followed MD5 as the common default and was used widely in SSL certificates and Git commits. In 2017 researchers from CWI Amsterdam and Google published the first practical SHA-1 collision (the SHAttered attack). Major browsers and certificate authorities no longer accept SHA-1 certificates.
SHA-256 — the current standard
SHA-256 is part of the SHA-2 family and produces a 256-bit (64-character hex) digest. No practical collision attack exists against it. It is used in TLS certificates, code signing, Bitcoin's proof-of-work, and most modern security systems.
SHA-512 — extra strength
SHA-512 produces a 512-bit (128-character hex) digest. It is not significantly more secure than SHA-256 against current threats, but it can be faster on 64-bit CPUs because the algorithm operates on 64-bit words and processes larger blocks. The advantage depends on the hardware: CPUs with dedicated SHA-256 instructions can make SHA-256 the faster of the two. SHA-512 is preferred in some high-security contexts.
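Whether SHA-512 really outpaces SHA-256 depends on the CPU and on how the underlying crypto library was built, so it is worth measuring on the target machine. The following is a rough sketch using the standard `timeit` module; the helper name `seconds_to_hash`, the 4 MiB payload, and the repeat count are arbitrary assumptions, and the numbers it prints are not a rigorous benchmark.

```python
import hashlib
import timeit


def seconds_to_hash(algorithm: str, data: bytes, repeats: int = 10) -> float:
    """Total wall-clock seconds to hash `data` `repeats` times."""
    return timeit.timeit(lambda: hashlib.new(algorithm, data).digest(), number=repeats)


payload = bytes(4 * 1024 * 1024)  # 4 MiB of zero bytes, purely illustrative
for algorithm in ("sha256", "sha512"):
    elapsed = seconds_to_hash(algorithm, payload)
    print(f"{algorithm}: {elapsed:.3f} s for 10 passes over 4 MiB")
```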
Quick reference: which to use
| Algorithm | Bit length | Security | Good for |
|---|---|---|---|
| MD5 | 128 | Broken | Checksums, cache keys, non-security deduplication |
| SHA-1 | 160 | Deprecated | Legacy systems, Git internals (non-security) |
| SHA-256 | 256 | Secure | Certificates, code signing, file integrity checks, APIs |
| SHA-512 | 512 | Secure | High-security contexts, 64-bit performance-sensitive systems |
Hashing is not encryption
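A common misconception is that hashing and encryption are the same thing. Encryption is reversible: given the key, you can decrypt the data. Hashing is one-way: there is no key and no reverse operation. This is why passwords should be stored as hashes rather than encrypted: if the database is breached, there is no key that unlocks every password at once. A single pass of a fast hash such as MD5 or SHA-256 is not enough, though, because an attacker can guess candidate passwords and hash each one until a match appears. Password storage should use a unique salt per password and a deliberately slow algorithm such as bcrypt, scrypt, Argon2, or PBKDF2.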
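As a minimal sketch of storing a password as a hash instead of encrypting it, the example below uses PBKDF2 from Python's standard `hashlib`, a salted and deliberately slow construction, since one pass of a fast hash is cheap to brute-force. The function names and the iteration count are illustrative assumptions; check current guidance before choosing parameters for a real system.

```python
import hashlib
import hmac
import os


def hash_password(password: str, iterations: int = 600_000) -> tuple[bytes, bytes, int]:
    """Return (salt, derived_key, iterations); all three must be stored."""
    salt = os.urandom(16)  # unique random salt per password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived, iterations


def verify_password(password: str, salt: bytes, expected: bytes, iterations: int) -> bool:
    """Re-derive the key from the candidate password and compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, stored, rounds = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, stored, rounds))
    print(verify_password("wrong guess", salt, stored, rounds))
```

Verification never decrypts anything: it repeats the one-way derivation on the submitted password and checks whether the result matches what was stored.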
File integrity verification
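One of the most practical everyday uses of hashing is verifying that a downloaded file arrived intact. When you download software, the publisher often lists the SHA-256 hash of the file. After downloading, you run the same hash function on the file and compare the result. If they match, the file was not corrupted in transit. A match also rules out tampering, provided the published hash itself came from a source you trust, such as the publisher's own HTTPS site, and not from the same place an attacker could have altered the file.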
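The comparison described above can be sketched in a few lines of Python. The function names are hypothetical, and the file is read in chunks so that large downloads do not have to fit in memory.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file incrementally and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and spaces."""
    normalized = "".join(published_hex.split()).lower()
    return sha256_of_file(path) == normalized


# Usage, with a placeholder file name and the hash copied from the publisher:
# matches_published_hash("installer.bin", "<published SHA-256 hex string>")
```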