Database Normalization

This post is meant to be an introductions to database normalization. So let's start with the definition of what database normalization actually is from our friends over at Microsoft:

Normalization is the process of organizing data in a database. This includes creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.

By eliminating redundancy we're eliminating repeated data making our servers lighter and faster. There are currently fourt normal forms: First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), Boyce/Codd Normal Form (BCNF), Fourth Normal Form (4NF), and Fifth Normal Form (5NF). I would like to point out that your database is generally considered normalized once you've reached the the Third Normal Form so Ill just be going over the steps to get your table normalized to that point. When normalizing your database you must do so in order, which means that you must first reach 1NF before you can start normalizing your form into 2NF and so on.

First Normal Form:

  • Table must have a primary key.
  • Cannot store more then one value per column.
    1NF Example

Second Normal Form

  • Create separate tables for values that are related to more then one record.
  • Create a helper table with foreing keys in order to connect these tables.
    2NF Example

Third Normal Form

  • Remove columns that dont depend on the primary key.
    3NF Example

I have created a small applicaiton called SqlHelper that helps you create and visualize your data tables, and it will write the sql queries necessary to create those tables on MySql. I would like to point out that in order for this application to work your form must be in its First Normal Form and it will only normalize your data up to its Second Normal Form. The last step of removing the columns that don't depend on the primary key would have to be done by you.