How to Model Inheritance in a Relational Database

by
Patrycja Dybka
Community Manager at Vertabelo

Posted: July 28, 2014

In the process of designing our entity relationship diagram for a database, we may find that attributes of two or more entities overlap, meaning that these entities seem very similar but still have a few differences. In this case, we may create a subtype of the parent entity that contains distinct attributes. A parent entity becomes a supertype that has a relationship with one or more subtypes.

First, let’s take a closer look at a simple class diagram.

The UML symbol for a subclass association is an open arrowhead that points to the parent class.

The subclass association line is labeled with specialization constraints. Constraints are described along two dimensions:

  1. incomplete/complete
    • In an incomplete specialization only some instances of the parent class are specialized (have unique attributes). Other instances of the parent class have only the common attributes.
    • In a complete specialization, every instance of the parent class has one or more unique attributes that are not common to the parent class.
  2. disjoint/overlapping
    • In a disjoint specialization, an object could be a member of only one specialized subclass.
    • In an overlapping specialization, an object could be a member of more than one specialized subclass.

The following diagram presents class Client.

In class Client we distinguish two subtypes: Individual and Company. This specialization is disjoint (client can be an individual or a company) and complete (these are all possible subtypes for supertype).

Inheritance

Let’s model this situation and discuss the results (I will use Vertabelo, our online database modeling tool).

One table implementation

In a one table implementation, table client has attributes of both types.

The diagram below shows the table client and two views: individual and company:




In this implementation:

  • Access to supertype rows is optimal (it is simple to have a list of all clients, it’s not necessary to make costly joins)
  • Effectiveness problem with access to subtype rows. Some rows have to be discarded, because one table contains rows for all (in this case two) subtypes.
  • There has to be an additional attribute to specify the subtype (attribute ‘type’ may have value ‘i’ for ‘individual’ and ‘c’ for ‘company’ and no other value).
  • It's easy to change the object’s subtype (we have to change ‘type’ from ‘i’ to ‘c’ or the other way around).
  • Many attributes are subtype-specific, and these columns have null values on rows that apply to other subtype’s attributes.

CREATE VIEW individual AS
select id, address, name, surname 
from client
where status = 'i';

CREATE VIEW company AS
select id, address, name, surname 
from client
where status = 'c';

Two-table implementation

In a two-table implementation, we create a table for each of the subtypes. Each table gets a column for all attributes of the supertype and also a column for each attribute belonging to the subtype. Access to information in this situation is limited, that’s why it is important to create a view that is the union of the tables. We can add an additional attribute called ‘type’ that describes the subtype.

The diagram below presents two tables, individual and company, and a view (the blue one) called client.




The view’s script is as follows:

CREATE VIEW client (id, address, name, surname,
company_name, industry) AS 
select id, address,name,
surname,null, null, type
from individual
union all
select id, address, null, null,
company_name, industry, type 
from company;

However, the example above generates some problems:

  • problem with implementation UID for supertype. There is no common ID.
  • if we want to change object’s type, for example, company to client, we need to do many INSERT, DELETE operations

Three-table implementation

In a third solution we create a single table client_t for the parent table, containing common attributes for all subtypes, and tables for each subtype (individual_t and company_t) where the primary key in client_t (base table) determines foreign keys in dependent tables. There are three views: client, individual and company.




CREATE VIEW client AS
select
  client_t.id,
  client_t.address,
  company_t.company_name,
  company_t.industry,
  individual_t.name,
  individual_t.surname
  from client_t, individual_t, company_t
  where client_t.id = individual_t.client_id
  and client_t.id = company_t.client_id;

CREATE VIEW individual AS
select
  client_t.id,
  client_t.address,
  individual_t.name,
  individual_t.surname
  from client_t, individual_t
  where client_t.id = individual_t.client_id;

CREATE VIEW company AS
select
  client_t.id,
  client_t.address,
  company_t.company_name,
  company_t.industry
  from client_t, company_t
  where client_t.id = company_t.client_id;

In this situation:

  • it’s necessary to make (too) many (too) costly joins,
  • changing the subtype is not effective,
  • there is no problem with UIDs, because UIDs for subtypes are determined by the supertype’s UID.
  • It’s very easy to search against all subtypes (dependent table’s foreign key is also a primary key)
 
 

Try our online database modeler. No registration. No commitments.

 
 
Tags
 
 
Subscribe to our newsletter

If you find this article useful, join our weekly newsletter to be notified about the latest posts.

 
 
 
Vertabelo Academy It's time to speak the new lingua franca of the Web! Online Course ● Tons of Exercises ● Designed for Beginners DETAILS Check our other courses: