Most of the best books on data modeling were originally written decades ago. But the concepts they discuss are more relevant than ever; in today’s business world, most decisions are made on the basis of well-modeled data repositories.
With the rise of NoSQL databases, unstructured information, and Big Data repositories (where quantity seems to be more important than quality), you might think that data modeling is an archaic discipline. That notion can be maintained until inconsistent data, anomalies, and inaccurate reports start to pop up everywhere and you begin losing confidence in your data. It’s at this point you discover why you need data modeling and why it’s still relevant.
Data modeling is about to turn 50, but the concepts it handles have not changed substantially over that time. In fact, they are perhaps more relevant now than they were a few decades ago, as more business decisions are based on the analysis of reliable and consistent data sources. That is why books on the subject are still important, even if they were written decades ago. In addition, several of the authors of must-read data modeling books periodically generate new editions updated with the latest technologies and trends.
Below we share a list of the most recommended data modeling books, starting with the more beginner-friendly ones and moving up to those that cover the advanced and even philosophical aspects of data modeling. On a related topic, you might also want to check out our selection of the top books on database design.
Our Top 8 Books on Data Modeling
Data modeling books for beginners should have a well-organized table of contents, provide easy access to what you want to read, and include step-by-step guides that allow a full understanding of the process. All of this is found in Data Modeling: A Beginner's Guide. Case in point: beginner readers get to create databases by following the examples in the book and discover that they really work.
The structure of the book includes sections aimed at reinforcing the knowledge provided by the author, including Ask the Expert (a Q&A section with lots of tips and additional information) and Try This (practical exercises that show how to apply knowledge). At the end of each chapter, the reader takes a test to verify that they’ve understood the concepts discussed.
The author, Andy Oppel, does a good job of not presenting his preferred method as if it were the only one. Instead, he shows different alternatives for arriving at the same result. In addition to Data Modeling: A Beginner’s Guide, he’s written many other database books, such as Databases Demystified, SQL Demystified, and Databases: A Beginner's Guide.
If you’re just starting with data modeling, this book’s practical examples of the main concepts are a valuable find. The content is easy to read and includes a few useful templates that novice modelers can use as a starting point for their creations.
No prior knowledge of technologies, structures, or databases is required to read this book. If you’re looking to dabble in data modeling from a non-technical background, such as that of a business analyst, it’s a good fit. But if you want to become a professional modeler or acquire full-blown database designer skills, the content might feel incomplete. Even so, the book lets you speak the same language and exchange ideas with a designer in charge of modeling.
It is striking, however, that there are chapters on very specific concepts (e.g. unstructured data and XML) that are not directly related to the day-to-day activity of a data modeler. They certainly provide useful information, but they seem to be randomly chosen topics occupying space where I’d rather see subjects more in line with data modeling.
Anyone who has learned data modeling through a university degree or a specific course will assume that the concepts Gavin Powell explains in this book are already familiar to them. However, it is worth reinforcing those concepts from time to time, and database books are a good way to do so. In Database Modeling Step by Step, the author explains the basic concepts more clearly than many academic texts and university professors.
Aiming to simplify relational database modeling, Database Modeling Step by Step presents the standard techniques for database normalization and then proposes its own methodology. It offers an intuitive way to build relational database models that you can put into practice with database design tools like Vertabelo.
This book is not pure theory. Its author explains how logical models are implemented in modern RDBMSs, whether they support online transaction processing (OLTP) systems (e.g. e-commerce sites) or analytical processing, Big Data, and data warehouses. It also includes a good introduction to the SQL language.
The historical and humorous quotes at the beginning of each chapter are an added value. They not only make the reading more enjoyable, but also help establish the scope of each chapter.
These three books cover all kinds of data models – both common and industry-specific – that can be customized for particular needs. You can use the models, designs, and scripts presented in these books as a starting point for developing your own models, for extracting ideas and validating existing models, or for learning about various functional areas.
In addition to providing examples and templates, the author offers techniques for working with data models, such as how to convert a model from one level to another and how to achieve the appropriate level of abstraction in each. He also provides tables with example data to bring the models to life.
The first two volumes include generic and specific models for a wide variety of industries and for an extensive set of typical business subject areas. The text is clearly written and the content goes beyond the obvious, reaching even some of the most intricate business reporting requirements. Volume 3 contains notions that may seem repetitive and unsuitable for novice modelers; for experienced modelers, it is a valuable addition that takes relational modeling to a new level.
There are those who prefer to learn data modeling by reading a good selection of database books instead of taking a university degree. Data Modeling Essentials is for these people. It should be noted that this is a thick and heavy book, so beginners will probably prefer to start with a shorter introductory read.
The book is organized into three parts. The first introduces the main concepts of data modeling and explains them in depth. It covers alternative models like ER and UML and notions such as normalization, generalization/specialization, relationships, attributes, etc. The second part deals with the data modeling cycle, explaining its phases from requirements gathering to the logical design of the model.
In the third part of the book, the authors discuss advanced normalization concepts (4th and 5th normal forms, Boyce-Codd normal form), explaining them in understandable and reasonable ways. They also discuss time-dependent data modeling and business rules.
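To give a flavor of what these normal forms guard against, here is a minimal sketch (not taken from the book) of a Boyce-Codd normal form check on sample data. The `holds` helper, the `enrollments` table, and its column names are hypothetical and chosen only for illustration. Note that testing sample rows can only show that the data is consistent with a dependency; whether the dependency is a real business rule is a modeling decision.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # The first value seen for a key is stored; any later mismatch breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical table: each instructor teaches exactly one course,
# and each student has one instructor per course.
enrollments = [
    {"student": "Ana", "course": "Databases", "instructor": "Kim"},
    {"student": "Ben", "course": "Databases", "instructor": "Kim"},
    {"student": "Ana", "course": "Networks", "instructor": "Lee"},
    {"student": "Cy", "course": "Databases", "instructor": "Roy"},
]

# instructor -> course holds in this data...
fd_holds = holds(enrollments, ["instructor"], ["course"])
# ...but instructor does not determine the whole row, so it is not a key.
instructor_is_key = holds(enrollments, ["instructor"], ["student", "course"])

# BCNF requires the left side of every non-trivial dependency to be a superkey.
bcnf_violation = fd_holds and not instructor_is_key
print(bcnf_violation)
```

In this sketch, the fact that an instructor teaches a given course is repeated in every enrollment row. Splitting the table into one relation for instructor and course and another for student and instructor would remove that redundancy.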
To complete and update the content – which mainly covers data modeling for OLTP – the authors have added a few short chapters on data warehousing, data marts, enterprise data modeling, and general aspects of data management.
Learning the fundamentals of a data modeling technique is not the same as learning how to use it and apply it to real-life problems. Data Model Patterns is aimed at analysts who know how to create data models but need help developing them in a way that effectively solves an organization’s data needs.
The author, David Hay, is passionate about data modeling. He discusses the needs of areas such as accounting, manufacturing processes, and material requirements planning in different industries. Each chapter develops high-level data models for each of the business areas of an organization. However, because the book proposes its own methodology, it may be difficult to find a suitable design tool to work in the way the author proposes.
Some might think that, since there are so many off-the-shelf products to solve needs common to all companies, creating data models for those areas is like reinventing the wheel. However, it is very common for organizations to have subtle details that escape the generality; when trying to impose a standard model on them, users start using ancillary tools (e.g. their own Excel spreadsheets) to handle these peculiarities. Analyzing an organization’s requirements in depth and creating a data model to solve them is never a waste of time, even if you opt for an off-the-shelf solution to meet those requirements.
Many books can teach you the basics of data modeling, whether you are a beginner or an experienced designer. But to be sure that your data models are accurate, complete, and useful, you need a way to judge those qualities in your models. The Data Modeling Handbook offers an answer to that need – providing a wealth of real-life examples, annotated diagrams, and a rich catalog of rules and best practices. In addition, authors Michael Reingruber and William Gregory explain how to align your data models with organizational goals, policies, and plans.
The book presents the rules in a variety of notations, including Chen’s, Finkelstein’s IE notation, James Martin’s ERDs, and the IDEF1X notation for semantic information models. It compares the most popular modeling styles and demonstrates that good models can be built using any type of notation. It also describes how to use CASE tools to build models efficiently without neglecting accuracy.
Finally, The Data Modeling Handbook proposes a detailed guide for establishing an ongoing quality assessment program that does not impede the data model design workflow.
This book talks about the ambiguities that make every data modeling project difficult – and eventually lead us to failure if we do not deal with them properly. It also delves into the more philosophical aspects of database design, venturing into crucial issues that were raised in the 1970s and are still valid today. The topics are presented in a technology-agnostic way and with a pleasant, easy-to-read, and sometimes humorous writing style.
Don’t expect to find a practical guide to data modeling in the pages of Data and Reality; it’s advisable to have prior database design knowledge to get the most out of this book. But if you’ve already designed a few databases and have had to jump through some hoops, you’ll find a few concepts that will prove invaluable.
The author, William Kent, emphasizes the notion that every model is imperfect due to the complexity and ambiguity of reality. He posits that simplistic attempts to capture reality in a “pseudo-exact” way lead straight to nowhere. Far from presenting a bleak picture for database designers, Kent gets right to the heart of the key concepts we need to keep in mind to make a data model successful.
A Data Modeling Book for Every Occasion
When looking for books on data modeling, you may prefer one that gets you started in the practice from scratch, or one that teaches everything from A to Z, or one that you can keep handy to refer to when you need to solve a particular problem, or even one that is a fun read to take with you on vacation. Whatever your preferences, the right book for you is surely on the list we have presented here. Happy reading!