Create good logical, physical, and conceptual data models using these expert database model preparation tips.
To put it simply, a database model is a data model that determines the logical or physical structure of a database. Database design is the process of creating a database model. A database model is used to capture information about the data that must be stored in a database.
If you're a bit unsure about the steps involved in the database design process, I suggest you read this description of database design steps. And if you are unsure about how to perform these steps, check out these tips for better database design.
In the data modeling process, Step 1 is gathering business requirements. This is where the database model is key.
Start Your Model
To create the model, you first need to talk with business users, which will help you understand what the requirements are. They will advise you on the information that you need to store. This will help lay the foundation for tables and their attributes (columns) and how the information is linked together (the relationships between tables).
In addition to the business domain and its associated challenges, you need to know how the database and its data will be used. For example, if you are storing customer data, what information do you need to record Why are you storing customers’ addresses – will you be sending them invoices? If so, how? By snail mail or electronically? The answer to those questions will impact your data model by indicating what information is mandatory or optional.
Once you fully know the requirements, you can start getting your modeling ideas together. You could write out the model for your database, but that would probably need a lot of words. Remember: A picture is worth a thousand words. That is where a database model diagram, referred to as an entity-relationship diagram (ERD), is used. This "picture" of a database model provides a straightforward way to communicate between business users, data modelers, and the development team. ERDs use a standard notation (boxes, lines, etc.) to document the database design. You can also find additional information about ERD symbols here.
However, database design is more than just an ERD. A good database model needs to go beyond the ERD. Unfortunately, you may find data modelers that think that their only contribution is the ERD. They attempt to create a data model that is a work of art, while some data modelers just worry about creating a few boxes and lines. The ERD is a key contribution from the data modeler, but it is also essential to keep the model up to date throughout the software development lifecycle. (More on that later in this article.)
Ingredients of a Data Model
Depending on the type of data model that you are creating (conceptual, logical or physical), your data model will contain more or less detail. Generally speaking, a data model contains information about the data to be stored.
- At its most basic level, that data is organized into entities which are stored in database tables. The basic element of any data model is a table.
- Within a table, we store rows of data. Each row may contain many columns, which are the attributes of the item that we are storing in the table.
- Each column (or attribute) will store a particular type of data, whether that’s a string of characters, a number with or without decimal digits, a Boolean (true or false) value, etc.
- Some columns may be required (mandatory); some may be optional. Some columns have constraints on the types of values they can store.
- And tables are generally linked together through relationships. How is the information related?
Here is a simple, not-yet-complete logical data model for data on customers, suppliers, invoices, payments, and accounts.
I used Vertabelo to create the above model. With a database design tool, I don't need to worry too much about the exact notation. I focus on the design while the modeling tool implements the entity-relationship diagram notations and symbols.
Reuse Your Data Models
Different types of data models are often created during the database development process. Typically, we work with 3 levels of modeling:
- Conceptual: As its name implies, this high-level model generally has the least amount of detail. It answers the question: What are the main concepts involved in the data model?
- Logical: The logical data model is where we get into the details. We plan what we are storing, but we don’t go into the details of how we’ll store it. This model answers the question: What are all the data elements that we need to manage?
- Physical: The physical data model is basically a conversion of the logical data model into a detailed implementation for a specific database management system (DBMS). We need to follow the rules imposed by the particular DBMS and the particular version with which we will be working.
If you've heard that logical data modeling is dead or no longer required in the Agile software development process, you might find this article interesting.
A good data modeling tool will help us re-use our modeling work to go from one level to the next (e.g. from logical to physical). In this way, you’re not throwing away the work you do at the conceptual level. If you want to learn more, I recommend these articles on expanding a conceptual data model into a logical data model and generating a physical data model from your logical data model.
Of course, a physical data model is not the finished solution: you need to actually create the physical database. Good data modeling tools provide you with the means to generate the SQL (or DDL: Data Definition Language) scripts from your physical model. This gives us the technical definition of the database, which is used to automatically generate the necessary database with which the application will interact.
Working Together on Data Modeling
In addition, a good database modeling tool allows the model to be created collaboratively. It is important to remember that database design is more than just an ERD. Database design needs to go beyond the ERD; it is part of the software development lifecycle. We collaborate with business users to create the initial logical design; we also collaborate with the development team to update the logical and physical data models during the software development lifecycle.
If you are interested in using collaborative database design, follow this advice to get started using Vertabelo.
Other Database Modeling Challenges
If you’re modeling collaboratively, it can happen that someone accidentally makes a change that negatively impacts the model. Obviously, they can undo their changes – but if the changes were made a few days ago, that may no longer be possible. Vertabelo includes a version management system that includes manual and automatic saving modes and version tagging. If your database is actually used with a particular version of the associated software, you can track that tagged version and compare any future changes.
Another challenge that I have often faced is working with an existing database for which I have no data model. Vertabelo includes tools to reverse engineer from an SQL DDL script into a data model. Now, I doubt that you would reverse engineer from the SQL DDL scripts all the way back up to a conceptual data model, but you theoretically could. (Start from the scripts to the physical data model. Then, from the physical data model work back to the logical data model, and finally simplify your logical data model to a conceptual data model that accurately reflects the concepts of the business domain modeled in the existing database. This bottom-up reverse engineering can be useful when you are creating a new data model but you have an existing database that’s similar to your requirements.)
Another common challenge is migrating from a currently active database version to a new database model. Vertabelo can auto-magically generate migration scripts between versions of your data model. Manually writing and testing (and re-testing) migration scripts can be a real nightmare. Imagine if you could count on a tool to do the hard work for you! Vertabelo easily generates migration scripts between two versions of a database.
I must admit this is a dream come true! The tool manages versions and determines what the impact is of the changes made between different versions.
Data Modeling Best Practices
There are numerous best practices to be considered in data modeling. I will not even try to list them all here, but I will point you to other sources, such as this article. For now, here are a few that I think are essential:
- Think outside the database.
- Consider the nature of your application.
- Break your model into logical pieces.
- Document your model.
- Use naming, notation, and design conventions.
- Test early and test often.