database

Normalization vs Denormalization

Schema Mode

Interactive

info Normalized (3NF)

check_circle Write Optimized: Update "Delhi" once, it updates everywhere.
warning Read Heavy: CPU must connect 3 tables to show one order.
db_schema_visualizer.exe
USERS PK: user_id name: Rajat city: Delhi ORDERS PK: order_id FK: user_id amount: $500 PRODUCTS PK: prod_id FK: order_id name: Laptop DISK STORAGE (Simulated) ✓ No Duplicates
terminal Query Output Simulation
SELECT * FROM Orders
  JOIN Users ON Orders.uid = Users.id... // Slow

Core Definition

Normalization organizes data into multiple related tables to minimize redundancy.
Denormalization combines data into fewer tables to optimize read performance at the cost of redundancy.

Normalization
Tables: Many (↑)
Joins: Many (↑)
Storage: Low (↓)
Denormalization
Tables: Few (↓)
Joins: None (↓)
Storage: High (↑)

Normalization

dataset

Reduces data redundancy and inconsistency by dividing larger tables into smaller tables and linking them using relationships.

Benefits

  • check Eliminates duplicate data (Optimized Memory)
  • check Ensures data integrity & consistency

Drawbacks

  • close Complex queries requiring multiple Joins
  • close Slower read performance

Denormalization

table_view

Intentionally adds redundancy by combining tables. Used primarily to speed up read operations in complex systems.

Benefits

  • check Extremely fast read queries (No Joins)
  • check Simpler query logic

Drawbacks

  • close Wasted memory due to duplicate data
  • close High maintenance for updates (Data Integrity risk)

Feature Comparison

Normalization Denormalization
Stores non-redundant, consistent data Combines data for faster execution
Redundancy is minimized Redundancy is intentionally added
Maintains Data Integrity Integrity is harder to maintain
Optimizes Disk Space Consumes More Disk Space
Slower Reads (Joins needed) Faster Reads (Fewer/No Joins)
Best for: OLTP (Banking, Transactions) Best for: OLAP (Analytics, Reporting)

The "Clean Room" vs. "Fast Food"

Database design is a trade-off between organization and speed. Let's look at real-world examples.

The Organized Library
inventory_2

Normalization

In a library, every book has exactly one spot. If you need to correct a title, you do it once. However, finding all books by an author on a specific topic requires walking to different aisles (Joins).

Efficient Space Consistent
shopping_cart

Denormalization

Like a pre-packed grocery kit. It contains milk, eggs, and bread together. You grab one kit and go (Fast Read). But if milk expires, you have to check every single kit to replace it (Expensive Update).

Fast Retrieval Redundant Items
The "All-in-One" Bundle

Interviewer's Cheat Sheet

Master the "Why" and "When".

Insertion

Unable to add data because other dependent data is missing (e.g., Can't add a student without a course).

Update

Data inconsistency when updating. (e.g., Update address in one row but miss others).

Deletion

Accidental loss of data. (e.g., Deleting a course also deletes the only student record).

"Every fact must be about the key, the whole key, and nothing but the key."

Simple: If you have a 'Student' box, don't put the 'Teacher's Phone Number' in it just because the student has that teacher. The teacher gets their own box!
  • History: Keeping the price of an item at the time of purchase (Snapshot), so it doesn't change if the product price changes later.
  • Reporting (OLAP): When managers need dashboards that aggregate millions of rows instantly.
  • Performance: When a `JOIN` of 12 tables takes 10 seconds, but you need the page to load in 0.5 seconds.