Do you remember the post about update anomalies? I promised you we’d explain how to design tables which have no update anomalies. So here we go!
Today we begin a series of posts on data normalization. We will talk about functional dependencies, a concept that needs to be explained before we dive deeply into database schema normalization.
The subject is rather abstract and theoretical but I will try to restrain myself from going too deep into mathematics. I will try to keep things simple and down-to-earth. (The operative word being try ;) )
Simple Example: Person Table
The data in a table are usually not independent. The values in one column can determine the values in other columns. We’ll explain this using a simple example. Recall the
person table from our videos.
Example data in the table might be like this.
|SSN||First name||Last name||Date of birth||Address||Phone number|
Here, the value in the column
SSN (Social Security Number) determines the values in columns
phone_number. This means that if we had two rows with the same value in the
SSN column, then values in columns
phone_number would be equal. A person with SSN 123-98-1234 is always called Cindy Cry, is born on 15-05-1983, and so on. A situation like this is called functional dependency.
The notation for functional dependency:
ssn → first_name ssn → last_name ssn → date_of_birth ssn → address ssn → phone_number
In short, we might write it like this:
ssn → first name, last name, date_of_birth, address, phone_number
On the left hand side of the arrow we put the name of the column the other is dependent on. On the right hand side we put the name of the column that is dependent.
One thing is worth clarifying before we go on: a functional dependency is a constraint in the database schema, not in the data. You may notice that in our example all people named John are born on 30-01-1980. But it does not mean that we have a functional dependency:
first_name → date of birth
The constraint is not generally true for all data that might come into our table. It’s only a coincidence in our data. Coincidences don’t count. To find a functional dependency you only look for constraints that are generally true, for all possible data.
Another Simple Example: Car Table
One more example from our videos: the
car table. The obvious functional dependency is this:
vin → brand, model, production_year, price
vin determines the
A careful reader might ask: what about a dependency like this:
price → price
price column determines its own value? You’re right. A dependency like this is called trivial functional dependency. You usually omit trivial dependencies if you're writing up all functional dependencies in a table.
Example: Students Table
Not all functional dependencies have a single column in the left-hand side. Take a look at the
And some example data:
|Cindy Cry||6||Databases 101||Jack Magpie|
|John Novak||4||Databases 101||Margaret Beettle|
|Allan Smith||6||Algorithms||Paul Reason|
The student determines the semester they’re currently at, so we have a functional dependency.
student → semester
But there is one more functional dependency here. Student and lecture together determine the name of the teaching assistant tutoring the student. The lecture only is not enough: some lectures have more than one teaching assistant. The student only is not enough either: students attend many different lectures. Without the lecture name we don’t know which class we want to find a teaching assistant for. We note the functional dependency like this:
student, lecture → teaching assistant
And you always have plenty of trivial dependencies:
student, semester → student student, semester → semester lecture, semester, teaching assistant → lecture, teaching assistant
Example: Vertabelo Table
One more example: table with some Vertabelo data. In the table we have the name of the user, the id of the model, the role the user plays in this model (owner, editor, or viewer), and the number of tables in the model.
Some functional dependencies we have here:
The number of tables depends only on the model.
model → number of tables
The user role depends on both user and model.
user, model → role
Find all nontrivial functional dependencies in update anomalies example.