Let’s take a look at the following table:
|Customer||Purchase date||Product name||Amount||Price||Total price|
|Joe Smith||2014-02-14||Yoga mat||1||80||80|
|Jane Bauer||2014-02-16||Yoga block||2||30||60|
|Joe Smith||2014-02-14||Yoga block||2||30||60|
|Joe Smith||2014-02-14||Yoga strap||1||10||10|
|Thomas Apple||2014-02-18||Dumbbells 2kg||2||30||60|
|Jane Bauer||2014-02-16||Yoga mat||1||80||80|
What’s wrong with this table? It’s difficult to modify data in it. Upon modification, several anomalies can occur:
It’s impossible to insert a product into the table if the product hasn’t been bought by a customer yet. Similarly, it’s impossible to insert a customer who hasn’t made a purchase yet.
It’s difficult to update data in the table. If you want to change the name of the product, you have to update all rows where the product is bought. You cannot change the price of the product for all future purchases.
If you delete the Thomas Apple purchase (say, because the order was cancelled), you will also delete the product “Dumbbells 2kg.”
How do you deal with tables like this? You have to normalize them! A detailed post on data normalization will appear soon. Stay tuned!