How To: Model Inheritance in Databases
Playing with some Object Relational Mapping technologies these days such as Linq to SQL and ADO.Net Entity Framework, one of the capabilities that is important to check is which strategies of inheritance modeling is supported by each technology.
This post talks about how to model inheritance in the database, regardless the technology that we will use later to talk to the database.
Assuming that the business entities look somewhat similar to this:
There are three common strategies of modeling this class hierarchy in the database:
Table per Class Hierarchy
This strategy means that all the class hierarchy is stored in a single table in the database. The top level columns can allow nulls but all the lower level columns must allow nulls. There is an additional column (in the following example: PersonType) which is the discriminator – it holds a value indicating the level in the hierarchy to which the record matches.
This strategy is very useful when selecting rows from the database of multiple person types. The select goes only to this table and therefore more efficient. It is also very simple to add a column to all the person types – we just have to add it to this table. The downside of this strategy is that it cannot enforce constrains such as not allowing nulls in columns that are not in the top level. It also may lead to many columns with null values and therefore to wasting some disk space.
Table per Subclass
This strategy means that each level in the class hierarchy is represented in a table in the database. The top level table (Person) contains the columns that are common to all levels and as we go down in the hierarchy, we get only the additional columns for the specific level. The root table contains the discriminator columns, for joining with the appropriate table when selecting rows.
This strategy there can require less disk space, and can enforce required columns. It also allows adding new columns to subclasses without having to change other tables. The down side of this strategy is the cost of each select statements and the amount of joins it requires.
Table per Concrete Class
Since not all the classes in the above diagram are concrete (Person and Employee classes are abstract), there is no need to create tables for each level in the hierarchy. In this strategy each concrete class has its own table.
Notice that there isn't any table that contain any common columns. This means that adding columns to all tables should be done on each of the tables. Selecting rows is very efficient when selecting a single type of person, but can cost much more when we need to union between the tables in order to get multiple types or find the type of the person only by its primary key.
There are 3 common ways of modeling inheritance in databases, each on of them has its pros and cons. It is very important to be familiar with those strategies when investigating ORM technologies.