How To: Model Inheritance in Databases

30 בספטמבר 2007

How To: Model Inheritance in Databases


Playing with some Object Relational Mapping technologies these days such as Linq to SQL and ADO.Net Entity Framework, one of the capabilities that is important to check is which strategies of inheritance modeling is supported by each technology.


This post talks about how to model inheritance in the database, regardless the technology that we will use later to talk to the database.


Assuming that the business entities look somewhat similar to this:


 Model Inheritance in Databases


There are three common strategies of modeling this class hierarchy in the database:


Table per Class Hierarchy


This strategy means that all the class hierarchy is stored in a single table in the database. The top level columns can allow nulls but all the lower level columns must allow nulls. There is an additional column (in the following example: PersonType) which is the discriminator – it holds a value indicating the level in the hierarchy to which the record matches.


Table per Class Hierarchy


This strategy is very useful when selecting rows from the database of multiple person types. The select goes only to this table and therefore more efficient. It is also very simple to add a column to all the person types – we just have to add it to this table. The downside of this strategy is that it cannot enforce constrains such as not allowing nulls in columns that are not in the top level. It also may lead to many columns with null values and therefore to wasting some disk space.


Table per Subclass


This strategy means that each level in the class hierarchy is represented in a table in the database. The top level table (Person) contains the columns that are common to all levels and as we go down in the hierarchy, we get only the additional columns for the specific level. The root table contains the discriminator columns, for joining with the appropriate table when selecting rows.


Table per Subclass


This strategy there can require less disk space, and can enforce required columns. It also allows adding new columns to subclasses without having to change other tables. The down side of this strategy is the cost of each select statements and the amount of joins it requires.


Table per Concrete Class


Since not all the classes in the above diagram are concrete (Person and Employee classes are abstract), there is no need to create tables for each level in the hierarchy. In this strategy each concrete class has its own table.


Table per Concrete Class


Notice that there isn't any table that contain any common columns. This means that adding columns to all tables should be done on each of the tables. Selecting rows is very efficient when selecting a single type of person, but can cost much more when we need to union between the tables in order to get multiple types or find the type of the person only by its primary key.


Conclusion


There are 3 common ways of modeling inheritance in databases, each on of them has its pros and cons. It is very important to be familiar with those strategies when investigating ORM technologies.


Enjoy!

Add comment
facebook linkedin twitter email

14 comments

  1. Anonymous30 בספטמבר 2007 ב 18:37

    Another one:
    Union of all classes; needs a flag(column) that discriminates between types.

    In most cases 'Table per Subclass' is the best choice.

  2. Wayne1 בנובמבר 2007 ב 5:06

    Got a question for ya. In the 'Table per Subclass' model, how are joins made between tables? Is the PersonID key propagated to its children? If not how are they joined. Also, with this model, how does one join to the correct subclass? I notice the Person entity includes personType Attribute. How is that used in a SQL statement to find the proper child entity?

  3. Guy Burstein [MVP]1 בנובמבר 2007 ב 22:04

    Hi Wayne,

    This is the major downside of this approach. You have to know what type you're looking for when you select the data, in order to select from the right table.
    Why do you need to join between two subclasses (two sub-tables)?

    Guy

  4. Tamir Shlomi3 בפברואר 2008 ב 15:19

    Great stuff.

    Thanx a lot,
    Tamir Shlomi.

  5. André Ricardo14 במאי 2008 ב 0:10

    And if i want to search for Bill in the database?

    Like:
    SELECT * FROM SalesPerson, Programmer, Contact WHERE SalesPerson.FirstName = 'Bill' OR Programmer.FirstName = 'Bill' OR Contact.FirstName = 'Bill'

    ??????

    I'm very, very lost..

    I'm trying to implement inheritance in SQL but i don't think i have enough knowledge to do this…

  6. Bob Tabor5 ביוני 2008 ב 21:12

    Wow, awesome article.

    I'll take a crack at answering two questions I didn't see answered above … I'll disclaim any real expertise in this area, so take this with a grain of salt.

    @Guy … joining between two subclasses … what came to mind first is the scenario where employee hierarchy … I'm a manager of some, and I report to others. Another example would be things that are composites of (potentially) other composites for systems that have to allow for product customization. I haven't actually sat down to design this, so I might be missing something obvious here.

    @Andre … I think you would want to avoid the "Table per Concrete Class" approach, which seems to be what you're modeling your query after. Either of the other two approaches would work for what you're trying to accomplish.

    Hope that helps!

  7. Peter1 ביולי 2008 ב 1:05

    It seems to me that the Table per Concrete Class has nothing to do with inheritance since every class is mapped to a table.

  8. Peter1 ביולי 2008 ב 1:20

    Is the Table per Subclass approach actually same as the Table Per Type?

    Thanks.

  9. Peter1 ביולי 2008 ב 22:52

    I think I have misunderstood the Table per Concrete Class concept (or the whole concept about model inheritance) earlier (the July 1, 2008 1:05AM comment).

    I'm confused about the terms related to model interance in databases since MS and other articles are using different terms. Here is my understanding about the terms:

    Table per subclass = Table per type (TPT) = Table per class
    Table per abstract class = Table per Hierarchy (TPH)

    Is my understanding correct?

  10. Gus26 באוגוסט 2008 ב 8:14

    @Peter. Table per abstract class != Table per Hierarchy. In TPH you have only one table mapped to the Base class, in this example Person.

    But since there might be other abstract classes derived from an abstract class, in the TPA you might end up with more than one table for one Hierarchy.

    I hope to have cleared that one up.

    Cheers!

  11. Tawhid18 בנובמבר 2008 ב 12:08

    Great stuff Bursteins.

    I'm trying to model my companies data using SID, and found this article good for recollecting UML techniques.

    Cheers
    Tawhid

  12. computer error comedy film20 במאי 2010 ב 0:09

    Some of the harsher colleges of the valve were allowed down as the form finished more traditional; for bug, in some personal services of the proximity, sullivan included an low, but this object was then grey from the european law.

  13. rezaxm26 ביולי 2011 ב 18:48

    how can i implement table per subclass strategy?
    i have a posts table (post_id(KEY), date, text)
    and a comments table(comment_id(KEY)->posts.post_id, post_id->posts.post_id)

    i try to do this by replace comment_id>post_id relation to inheritance in o/r designer, but it cause this error:

    the selected classes are mapped to different sources or the child class containts a primary key. classes participanting in an inheritance must map to same source and the derived class cannot have a primary key.

    :(

  14. Milox12 באוקטובר 2011 ב 23:04

    awesome man, great explanation!

Comments are closed.