
Hashing algorithms (SHA, MD5) in Cybersecurity - Deep Dive

Overview - Hashing algorithms (SHA, MD5)
What is it?
Hashing algorithms are methods that turn any input data, like a password or a file, into a fixed-size string of characters called a hash. The hash looks random but is always the same length no matter the input size. MD5 and the SHA family are well-known hashing algorithms that compute these hashes quickly, although MD5 and SHA-1 are no longer considered secure. Hashes help verify data integrity and protect sensitive information without revealing the original data.
Why it matters
Hashing algorithms exist to detect whether data has been changed or tampered with and to protect passwords and other sensitive data. Without hashing, stored passwords would sit in readable form and altered files would be hard to detect, leading to security breaches and loss of trust. For example, websites store password hashes instead of the passwords themselves, so even if hackers get access to the database, they cannot read the actual passwords directly. This keeps our online accounts safer.
Where it fits
Before learning hashing algorithms, you should understand basic data security concepts like encryption and data integrity. After mastering hashing, you can explore digital signatures, cryptographic protocols, and password management techniques. Hashing is a foundational tool in cybersecurity and data protection.
Mental Model
Core Idea
A hashing algorithm transforms any input into a fixed-size string that acts like a digital fingerprint: it is unique for all practical purposes, making it easy to check data integrity without revealing the original content.
Think of it like...
It's like pressing a leaf onto a piece of paper to create a leaf print; the print is unique to that leaf and always the same size, but you can't recreate the leaf from the print alone.
Input Data ──▶ [Hashing Algorithm] ──▶ Fixed-size Hash String

┌───────────────┐      ┌─────────────────────┐      ┌────────────────┐
│ Any size data │─────▶│ SHA or MD5 function │─────▶│ 128 or 256-bit │
│ (password,    │      │ (processes input)   │      │ hash output    │
│ file, message)│      └─────────────────────┘      └────────────────┘
└───────────────┘
Build-Up - 7 Steps
1
Foundation: What is a Hash Function?
🤔
Concept: Introduces the basic idea of a hash function as a tool that converts data into a fixed-size string.
A hash function takes any input data and produces a fixed-length string called a hash. This hash looks random but is always the same length no matter how big or small the input is. For example, the password 'password' becomes '5f4dcc3b5aa765d61d8327deb882cf99' using MD5.
Result
You get a unique, fixed-size string that represents the original data.
Understanding that hashing creates a consistent digital fingerprint helps you see why it’s useful for verifying data without storing the original.
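As a minimal sketch of the fixed-length idea, the snippet below uses Python's standard `hashlib` module to hash a short string and a one-megabyte input. The helper name `hex_digest` and the sample inputs are illustrative assumptions, not part of the original lesson.

```python
import hashlib

def hex_digest(algorithm: str, data: bytes) -> str:
    """Return the hex digest of `data` under the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()

short_input = b"password"
long_input = b"x" * 1_000_000  # one million bytes

md5_short = hex_digest("md5", short_input)
md5_long = hex_digest("md5", long_input)
sha256_short = hex_digest("sha256", short_input)

print(md5_short)                      # 5f4dcc3b5aa765d61d8327deb882cf99
print(len(md5_short), len(md5_long))  # 32 32 (128 bits as hex characters)
print(len(sha256_short))              # 64 (256 bits as hex characters)
```

The digest length depends only on the algorithm, never on the size of the input.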
2
Foundation: Properties of Good Hash Functions
🤔
Concept: Explains the key qualities that make a hash function secure and reliable.
Good hash functions have these properties: 1) Deterministic: the same input always gives the same hash; 2) Fast to compute (useful for checking files, although password hashing deliberately uses slow algorithms); 3) Pre-image resistance: hard to find the original input from a hash; 4) Collision resistance: hard to find two inputs with the same hash; 5) Avalanche effect: small input changes cause big hash changes.
Result
You know what makes a hash function trustworthy and why these properties matter for security.
Knowing these properties helps you evaluate if a hashing algorithm is safe to use in real-world applications.
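Two of these properties, determinism and the avalanche effect, are easy to observe directly. The sketch below counts how many of SHA-256's 256 output bits differ between two inputs; the helper names and sample strings are assumptions chosen for illustration.

```python
import hashlib

def sha256_as_int(data: bytes) -> int:
    """Interpret the 256-bit SHA-256 digest of `data` as an integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the output bits that differ between the hashes of `a` and `b`."""
    return bin(sha256_as_int(a) ^ sha256_as_int(b)).count("1")

same = differing_bits(b"hello", b"hello")     # identical inputs
changed = differing_bits(b"hello", b"hellp")  # last character changed

print(same)     # 0, because hashing is deterministic
print(changed)  # roughly half of the 256 bits flip
```

A one-character change flipping about half the output bits is what the avalanche effect looks like in practice.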
3
Intermediate: Understanding MD5 Hash Algorithm
🤔 Before reading on: do you think MD5 is still safe to use for password storage? Commit to your answer.
Concept: Introduces MD5, its design, and its current security status.
MD5 was one of the first widely used hashing algorithms. It produces a 128-bit hash and is very fast. However, researchers found practical ways to create collisions (different inputs producing the same hash), so MD5 can no longer be trusted to verify critical data. Its speed is also a weakness for passwords, because attackers can test enormous numbers of guesses per second.
Result
You learn that MD5 is fast but no longer secure for sensitive uses.
Understanding MD5’s weaknesses prevents you from using outdated methods that could expose data to attackers.
4
Intermediate: Exploring SHA Family Algorithms
🤔 Before reading on: which do you think is stronger, SHA-1 or SHA-256? Commit to your answer.
Concept: Explains the SHA family, focusing on SHA-1 and SHA-256 differences and improvements.
SHA stands for Secure Hash Algorithm. SHA-1 produces a 160-bit hash but is now considered weak due to collision attacks. SHA-256, part of the SHA-2 family, produces a 256-bit hash and is much stronger. It’s widely used today for secure hashing in many systems.
Result
You understand why SHA-256 is preferred over SHA-1 and MD5 for security.
Knowing the evolution of SHA algorithms helps you choose the right one for protecting data.
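The output sizes mentioned for MD5, SHA-1, and SHA-256 can be read straight from Python's `hashlib`, as this short sketch shows. The dictionary name `digest_bits` is an illustrative assumption.

```python
import hashlib

# Map each algorithm name to its digest size in bits.
digest_bits = {
    name: hashlib.new(name).digest_size * 8
    for name in ("md5", "sha1", "sha256")
}

for name, bits in digest_bits.items():
    print(f"{name}: {bits} bits")
# md5: 128 bits
# sha1: 160 bits
# sha256: 256 bits
```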
5
Intermediate: How Hashing Ensures Data Integrity
🤔
Concept: Shows how hashing helps detect if data has been changed or corrupted.
When you download a file, the website often provides a hash value. After downloading, you hash the file yourself and compare it to the provided hash. If they match, the file is intact. If not, the file was altered or corrupted during transfer.
Result
You can verify data integrity quickly and reliably using hashes.
Understanding this practical use of hashing connects theory to everyday cybersecurity tasks.
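The download check described in this step might look like the sketch below. The function names, the chunk size, and the temporary demo file are assumptions for illustration; a real download page would supply the published hash.

```python
import hashlib
import hmac
import os
import tempfile

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the file's hash with the published one."""
    return hmac.compare_digest(sha256_of_file(path), published_hex.strip().lower())

# Demo with a temporary file standing in for a download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example download contents")
    demo_path = tmp.name

published = hashlib.sha256(b"example download contents").hexdigest()
intact = matches_published_hash(demo_path, published)

with open(demo_path, "ab") as f:  # simulate corruption by appending a byte
    f.write(b"!")
tampered = matches_published_hash(demo_path, published)
os.remove(demo_path)

print(intact, tampered)  # True False
```

A matching hash only proves the file matches the published value, so the published hash itself should come from a trusted channel.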
6
Advanced: Salting Hashes for Password Security
🤔 Before reading on: do you think hashing alone is enough to protect passwords? Commit to your answer.
Concept: Introduces the concept of adding random data (salt) to passwords before hashing to improve security.
Salting means adding a unique random string to each password before hashing it. This prevents attackers from using precomputed tables (rainbow tables) to reverse hashes. Even if two users have the same password, their salted hashes will differ, making attacks much harder.
Result
You learn how salting strengthens password storage beyond simple hashing.
Knowing about salting helps you understand modern best practices for protecting user credentials.
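To make the effect of a salt concrete, here is a small sketch using SHA-256 with a random 16-byte salt per user. The function names, the salt length, and the sample password are assumptions. It illustrates salting only; for real password storage a deliberately slow algorithm such as bcrypt or Argon2 is the usual choice.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

def salted_hash(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest

def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate password with the stored salt and compare."""
    _, digest = salted_hash(password, salt)
    return hmac.compare_digest(digest, expected)

# Two users choose the same password but receive different salts.
salt_a, hash_a = salted_hash("hunter2")
salt_b, hash_b = salted_hash("hunter2")

print(hash_a != hash_b)                   # True: same password, different hashes
print(verify("hunter2", salt_a, hash_a))  # True
print(verify("wrong", salt_a, hash_a))    # False
```

The salt is stored alongside the hash; it does not need to be secret, only unique per password.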
7
Expert: Collision Attacks and Their Impact
🤔 Before reading on: can two different inputs produce the same hash? Commit to yes or no.
Concept: Explains how attackers exploit collisions to break hash security and why this is critical.
A collision happens when two different inputs create the same hash. Attackers can use this to trick systems into accepting fake data as real. For example, if a malicious file has the same hash as a trusted one, it might bypass security checks. This is why collision resistance is vital and why MD5 and SHA-1 are no longer recommended.
Result
You understand the real-world dangers of collisions and why strong hash functions are necessary.
Recognizing collision attacks clarifies why cryptographic standards evolve and why security is never static.
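Real collisions in MD5 or SHA-1 need specialised techniques, but the idea can be demonstrated on a deliberately weakened hash. The sketch below truncates SHA-256 to 16 bits, an assumption made purely so a collision turns up in a few hundred tries; the function names are hypothetical.

```python
import hashlib
from typing import Dict, Tuple

def truncated_hash(data: bytes, num_bytes: int = 2) -> bytes:
    """Keep only the first `num_bytes` of SHA-256 to simulate a tiny hash."""
    return hashlib.sha256(data).digest()[:num_bytes]

def find_collision(num_bytes: int = 2) -> Tuple[bytes, bytes]:
    """Try successive counters until two messages share a truncated hash."""
    seen: Dict[bytes, bytes] = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        tag = truncated_hash(message, num_bytes)
        if tag in seen:
            return seen[tag], message
        seen[tag] = message
        counter += 1

first, second = find_collision()
print(first != second)                                    # True
print(truncated_hash(first) == truncated_hash(second))    # True
```

With only 65,536 possible outputs a collision is guaranteed quickly; each extra output bit makes the search harder, which is why hash length matters.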
Under the Hood
Hashing algorithms process input data in fixed-size blocks through multiple rounds of mathematical operations like bitwise shifts, modular additions, and logical functions. These steps mix and scramble the input bits to produce a fixed-length output that appears random. The design ensures that even a tiny change in input drastically changes the output hash, making it infeasible to reverse the hash or to find collisions.
Why designed this way?
Hash functions were designed to be fast and deterministic while resisting reverse engineering and collisions. Early algorithms like MD5 prioritized speed but later showed weaknesses. Newer designs like SHA-2 balance speed with stronger security by using more complex operations and longer hash sizes. The tradeoff is between performance and security, evolving as attackers find new vulnerabilities.
Input Data ──▶ [Block Processing] ──▶ [Rounds of Mixing]
   │                  │                     │
   ▼                  ▼                     ▼
┌─────────┐      ┌─────────────┐      ┌──────────────┐
│ Split   │─────▶│ Bitwise and │─────▶│ Final Hash   │
│ into    │      │ modular ops │      │ Output       │
│ blocks  │      └─────────────┘      └──────────────┘
└─────────┘
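The block-and-rounds structure can be sketched with a toy function. This is NOT MD5 or SHA and is NOT secure; the name `toy_hash`, the constants, the 4-byte block size, and the round count are all assumptions chosen only to show padding, block splitting, and mixing with rotations, shifts, and modular arithmetic.

```python
MASK32 = 0xFFFFFFFF  # keep every value within 32 bits

def rotl32(value: int, shift: int) -> int:
    """Rotate a 32-bit value left by `shift` bits (0 < shift < 32)."""
    return ((value << shift) | (value >> (32 - shift))) & MASK32

def toy_hash(data: bytes, rounds: int = 4) -> str:
    """Toy 32-bit hash for illustration only; do not use for security."""
    # Pad to a multiple of 4 bytes, then append the message length.
    padded = data + b"\x80"
    padded += b"\x00" * (-len(padded) % 4)
    padded += (len(data) & MASK32).to_bytes(4, "big")

    state = 0x6A09E667  # arbitrary starting constant
    for offset in range(0, len(padded), 4):
        block = int.from_bytes(padded[offset:offset + 4], "big")
        state = (state + block) & MASK32           # modular addition of the block
        for _ in range(rounds):
            state = rotl32(state, 7)               # bitwise rotation
            state ^= state >> 15                   # shift and XOR
            state = (state * 0x9E3779B1) & MASK32  # modular multiplication
    return f"{state:08x}"

print(len(toy_hash(b"")), len(toy_hash(b"a" * 1000)))  # 8 8
print(toy_hash(b"abc") == toy_hash(b"abc"))            # True
print(toy_hash(b"abc") == toy_hash(b"abd"))            # False
```

Real algorithms follow the same outline with much larger state, many more rounds, and carefully analysed operations.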
Myth Busters - 4 Common Misconceptions
Quick: Is MD5 still safe for storing passwords? Commit to yes or no.
Common Belief: MD5 is secure enough for password storage because it produces a unique hash.
Reality: MD5 is unsafe for password storage. It is so fast that attackers can brute-force guesses at very high rates, and practical collision attacks against it are known.
Why it matters: Using MD5 for passwords can lead to easy cracking and data breaches.
Quick: Does hashing encrypt data so it can be decrypted later? Commit to yes or no.
Common Belief: Hashing encrypts data and can be reversed to get the original input.
Reality: Hashing is one-way and cannot be reversed to reveal the original data.
Why it matters: Confusing hashing with encryption can lead to wrong security designs and false expectations.
Quick: Is it impossible for two different inputs to produce the same hash? Commit to yes or no.
Common Belief: Hash functions always produce unique hashes for different inputs.
Reality: Because the hash size is fixed while the number of possible inputs is unlimited, collisions must exist. Good algorithms make them infeasible to find.
Why it matters: Ignoring collisions risks trusting insecure hashes, leading to potential attacks.
Quick: Is salting unnecessary if you use a strong hash like SHA-256? Commit to yes or no.
Common Belief: Strong hashes alone are enough; salting is optional.
Reality: Salting is essential even with strong hashes to prevent precomputed attacks and ensure unique hashes.
Why it matters: Skipping salting weakens password security and makes systems vulnerable to rainbow table attacks.
Expert Zone
1
Some hashing algorithms like SHA-3 use completely different internal structures (sponge construction) compared to SHA-2, offering resistance to different attack types.
2
Performance of hashing algorithms can vary greatly depending on hardware; some run efficiently on CPUs, others on GPUs or specialized chips. This affects practical security, because a hash that is fast on an attacker's hardware is also fast to brute-force.
3
Hash length impacts security: longer hashes reduce collision chances but increase storage and computation, so choosing the right balance is critical.
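The link between hash length and collision chances can be estimated with the standard birthday approximation, p ≈ 1 - exp(-k(k-1) / 2^(n+1)) for k random hashes of n bits. The function below is a rough sketch of that formula; its name and the sample figures are illustrative assumptions.

```python
import math

def collision_probability(num_hashes: int, hash_bits: int) -> float:
    """Approximate chance that at least two of `num_hashes` random hashes collide."""
    exponent = -num_hashes * (num_hashes - 1) / 2 ** (hash_bits + 1)
    return -math.expm1(exponent)  # equals 1 - exp(exponent), accurate for tiny values

p_128 = collision_probability(2 ** 64, 128)  # 128-bit hash such as MD5
p_256 = collision_probability(2 ** 64, 256)  # 256-bit hash such as SHA-256

print(f"{p_128:.2f}")  # 0.39
print(p_256 < 1e-30)   # True
```

This is why an n-bit hash offers at most about n/2 bits of collision resistance, even with no design flaws.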
When NOT to use
Hashing is not suitable when you need to recover original data; encryption should be used instead. Also, for password storage, use specialized algorithms like bcrypt or Argon2 that include salting and are designed to be slow to resist brute-force attacks.
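bcrypt and Argon2 need third-party packages, so the sketch below swaps in PBKDF2-HMAC-SHA256 from Python's standard `hashlib` to show the same idea: a per-password salt plus a deliberately slow derivation. The function names and the iteration count are assumptions for illustration, not a recommended configuration.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # illustrative work factor; tune for your own hardware

def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) using a fresh random salt."""
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key

def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Repeat the slow derivation with the stored salt and compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected_key)

salt, key = hash_password("correct horse")
print(verify_password("correct horse", salt, key))  # True
print(verify_password("wrong guess", salt, key))    # False
```

Each guess costs an attacker the full iteration count, which is what makes slow algorithms resistant to brute force.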
Production Patterns
In real systems, hashes are combined with salts and stored securely for passwords. Digital signatures use hashes to verify message integrity. File verification tools use hashes to detect corruption. Blockchain technology relies heavily on hashing to link blocks securely.
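The block-linking idea behind blockchains can be illustrated with a simplified hash chain. This is only a sketch: the field names, the all-zero starting hash, and the sample payloads are assumptions, and real blockchains add consensus rules, signatures, and much more.

```python
import hashlib
from typing import Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(prev_hash: str, payload: str) -> str:
    """Hash a block's payload together with the previous block's hash."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def build_chain(payloads: List[str]) -> List[Dict[str, str]]:
    """Link each payload to the hash of the block before it."""
    chain = []
    prev = GENESIS_HASH
    for payload in payloads:
        current = block_hash(prev, payload)
        chain.append({"prev_hash": prev, "payload": payload, "hash": current})
        prev = current
    return chain

def chain_is_valid(chain: List[Dict[str, str]]) -> bool:
    """Recompute every hash and check that each block points at its predecessor."""
    prev = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        if block_hash(block["prev_hash"], block["payload"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True

chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"])
valid_before = chain_is_valid(chain)

chain[1]["payload"] = "bob pays carol 200"  # tamper with a middle block
valid_after = chain_is_valid(chain)

print(valid_before, valid_after)  # True False
```

Because every block's hash depends on the one before it, changing any earlier block invalidates the stored hashes that follow.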
Connections
Encryption
Complementary security techniques
Understanding hashing alongside encryption clarifies when to use one-way data verification versus reversible data protection.
Digital Signatures
Builds on hashing for integrity checks
Knowing how hashes create unique fingerprints helps grasp how digital signatures verify authenticity and prevent tampering.
Biometrics
Similar concept of unique identifiers
Hashing’s idea of unique fixed-size outputs parallels how biometric systems use unique physical traits to identify individuals securely.
Common Pitfalls
#1 Using MD5 to store user passwords.
Wrong approach: hashed_password = md5(user_password)
Correct approach: hashed_password = bcrypt.hashpw(user_password, bcrypt.gensalt()) (bcrypt generates the salt and embeds it in the result)
Root cause: Believing MD5 is secure enough without understanding its vulnerabilities and the need for salting and slow hashing.
#2 Assuming hashing encrypts data and can be reversed.
Wrong approach: original_data = decrypt(hash_value)
Correct approach: Store original data securely or use encryption if reversibility is needed; hashing is one-way.
Root cause: Confusing hashing with encryption due to similar goals of data protection.
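#3 Not using salt when hashing passwords.
Wrong approach: hashed_password = sha256(user_password)
Correct approach: hashed_password = sha256(unique_salt + user_password), with unique_salt generated randomly per user and stored next to the hash. Better still, use bcrypt or Argon2, which handle salting and are deliberately slow.
Root cause: Underestimating the risk of rainbow table attacks and the importance of unique salts.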
Key Takeaways
Hashing algorithms convert data into fixed-size strings that act like digital fingerprints, ensuring data integrity and security.
MD5 is outdated and insecure; modern systems use stronger algorithms like SHA-256 for integrity checks and salted, deliberately slow algorithms like bcrypt or Argon2 for password protection.
Hashing is one-way and cannot be reversed, unlike encryption which is designed to be reversible.
Salting hashes is essential to prevent attackers from using precomputed tables to crack passwords.
Understanding collisions and their risks is critical to choosing the right hashing algorithm for security.