The greatest satisfaction of database modelers is in seeing their creations transformed into efficient repositories of information for business processes. Of course, modeling databases can also have its share of frustrations, but we'll get to that.
What makes building scale models fun? Whether it’s RC cars, airplanes, ships, or a science-fiction spacecraft, scale models allow you to build a miniature representation of huge real-life objects.
I was fascinated by scale models when I was a kid. Maybe that’s why I find a similar fascination with building data models today. Sure, many things about database modeling differ from scale models – primarily that data models are an abstraction of reality rather than a miniature physical representation.
This means we have to use our imagination to look at a data model and think of it as a mechanism that mimics real-world business processes. This is precisely why fanatics like me think database modeling is also an art.
Artists take their perception of reality, pass it through the sieve of their imagination, and represent/communicate it through elements they can model: clay, canvas and paint, stone and chisel, pages, words, or sounds. We, data modelers, do almost the same, using ER (entity-relationship) diagrams and data objects.
The Artist and the Designer
The difference between artists and database designers is that artists are not always concerned about whether the public understands their art. Designers, on the other hand, are obliged to make sure their audience understands their models.
That’s why you need data modeling. After all, a data model must solve a need: to provide a useful and practical repository for storing data about a problem domain, from which information can be extracted that expands the users’ knowledge.
In addition, individual artists tend to manage their time and work at their own pace in their creative space. In contrast, database designers must adjust to the timeline and rhythm of the project, constantly interacting with co-workers so that their work always fits the needs of the organization sponsoring it.
Organizations need data modeling to create the foundations for all their critical information repositories. And our work as database designers does not end when it is exposed – that is when the data model begins to work and gets populated with information. From that moment on, we must be attentive and willing to make changes to the model when needs arise: to speed up queries or processes that take longer than necessary, to adapt to the changes in business needs, to account for changes in the infrastructure that supports it, or simply to improve/correct to a better design.
Normalization, the Magic Wand of Database Modeling
Anyone can create tables in a database. A non-designer can create relationships between the tables with just a little knowledge. But normalization is probably the main difference between a model created by a database designer and one created by someone else.
Mastering normalization techniques allows us to show off our skills in data modeling. It is just like when a car mechanic shows off their knowledge by taking a mere five seconds to find the root of the problem in our engine after we have spent months trying to figure it out on our own. Knowing how to normalize a database, and how to explain the benefits of doing so, is one of the main arguments for justifying our salary as database designers.
Of course, there are many other situations where we can show off our knowledge – for example, defining the optimal keys for the tables in a database. Making use of this skill ensures that programmers can write their queries easily and that the database responds quickly to any need.
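As a small sketch of what normalization and well-chosen keys buy us (the customers/orders tables are invented, and Python's built-in sqlite3 stands in for a real engine): customer facts live in one place and are referenced by key, so a correction touches exactly one row instead of every order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Instead of repeating the customer's name and city on every order row,
# the normalized design stores customer facts once and references them by key.
conn.executescript("""
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,   -- surrogate key, cheap to join on
        name  TEXT NOT NULL,
        city  TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Acme Ltd', 'Lima');
    INSERT INTO orders VALUES (100, 1, 250.0), (101, 1, 80.0);
""")

# Correcting the city touches exactly one row -- no update anomalies.
conn.execute("UPDATE customers SET city = 'Cusco' WHERE id = 1")
rows = conn.execute("""
    SELECT o.id, c.city
    FROM orders o JOIN customers c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
print(rows)  # [(100, 'Cusco'), (101, 'Cusco')]
```

Both orders immediately reflect the corrected city because they reference it through the key rather than carrying their own copy.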
The Challenges of Changes in Database Design
There is a word that sends shivers down our spines in the world of software development: change. Whether functional, infrastructure, or performance, changes in system requirements are a threat that even the perfect database design cannot escape.
Entire books and university courses teach how to deal with changes. The main objective of Agile methodologies is to adequately manage changes to prevent them from causing delays in project deadlines or from diminishing the quality of a software product.
But none of this keeps the smile from fading from our faces when we need to introduce changes to a data model. The situation is not so serious if the need for changes arises in the early stages of a project, when the model is freshly created and does not yet contain real information. However, more often than we would like, changes arise when the model is operational and the entire organization depends on it to function without interruption.
We have to look on the bright side. Knowing how to adapt a data model to any kind of change, at any stage of its life cycle, is an interesting challenge that gives us experience and helps us grow professionally.
The Challenges of Inheritance
Another interesting challenge we database designers face is that of legacy databases. We do not always have the good fortune of creating new data models without a history of defects, patches, and poorly thought-out adaptations.
Every designer, at some point in his or her career, must confront a legacy database and work with it, either to improve its design or to migrate it to a fresh, new schema. Either option is extremely attractive to those of us who welcome challenges with open arms.
Legacy databases have an advantage that new databases do not: they already have information that speaks for itself about their rules and conditions. When designing a new database, we must set the rules and conditions based on what the users tell us, with the risk we discover too late in the process that the information they gave us is inaccurate. In a legacy database, there is no room for doubt: you can run a few quick queries to find out what rules and conditions are imposed on the data.
In my experience, I have found it very useful to work with a data model populated with information. If I am working on a new database, I try to generate the physical data model as soon as possible, implement it on a test or development database engine, and populate it with real data. It doesn't matter if the design work has just begun and the database has only a handful of tables.
Having real information in the tables allows us to anticipate the behavior of the data, to determine if what the users are telling us is true, or to assess if their definitions are flawed. We can also see the impact of any design changes on the data.
Moments to Forget
Go-live moments are always a source of stress and angst. Any software developer who has experienced a go-live knows the anxiety of moving to production.
No matter how much testing has been done already, there is always that feeling something is missing or something unexpected may happen. In my experience, if you feel too confident when putting a database in production, you likely don't have a clear and accurate idea of what might happen or of the risks involved in the move.
I have experienced traumatic go-live situations throughout my career. It becomes apparent something is not working as planned, and decisions have to be made in the heat of the moment.
Something that has happened to me more than once is to discover that the volume and the frequency of the transactions received by the database are higher than expected. They say humans are the only animals that trip over the same stone twice! You need an action plan for when the user screens start freezing or displaying unexpected errors, or when deadlock errors start spreading like a virus.
The best medicine is good prevention. When moving a database from sandbox to production, good prevention means a testing process that puts far more stress on the database than the worst-case scenario calls for. For example, if the estimates say your database should support a maximum of 10 concurrent transactions, the test should simulate 50 concurrent transactions to give you a real margin of confidence.
I don’t want to relive those traumatic moments of the moves to production. But I have learned many valuable lessons from them I wouldn’t have found in books.
For example, at one point I did not trust the auto-increment functions in databases for defining unique IDs; I believed applications should keep control of ID generation. That lasted until a high transaction volume caused an application to generate duplicate IDs, producing primary key violation errors. Now I always try to use auto-increment keys in transactional data tables.
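The lesson, sketched in sqlite3 terms with a hypothetical payments table: declare the key auto-incrementing, never supply an ID from the application, and read the generated value back from `lastrowid`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE payments (
        id     INTEGER PRIMARY KEY AUTOINCREMENT,  -- engine generates the ID
        amount REAL NOT NULL
    )
""")

# The application never supplies an ID, so two near-simultaneous inserts
# cannot collide the way app-generated IDs can under load.
cur = conn.execute("INSERT INTO payments (amount) VALUES (?)", (10.0,))
first_id = cur.lastrowid
cur = conn.execute("INSERT INTO payments (amount) VALUES (?)", (25.5,))
second_id = cur.lastrowid
print(first_id, second_id)  # 1 2
```

Other engines offer the same guarantee under different names (`SERIAL`/`IDENTITY` in PostgreSQL, `AUTO_INCREMENT` in MySQL): the uniqueness of the key is the engine's problem, not the application's.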
Experience has also taught me that the database is the last line of defense against computer attacks. We, database designers, tend to work with the peace of mind that our databases live in a foolproof glass box, far away and safe from the risks of attacks. We are confident that threats are kept at bay, thanks to the protection mechanisms implemented at the infrastructure and application levels.
But if database engines offer robust security schemes, why not take advantage of them right from the design stage? It is important to keep security criteria in mind when designing databases for systems that store sensitive information.
This means asking DBAs to define database connection credentials with different privileges and asking developers to assign a different credential to each user profile according to the access level required by each. Then, in our designs, we can partition the schema by access level to ensure each user accesses only the objects they need.
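SQLite has no GRANT statement, but the schema-partitioning half of the idea can be sketched with views (the employees table and `employees_public` view are made-up examples): a low-privilege profile would be granted only the view, while the base table with the sensitive column stays out of reach. In engines like PostgreSQL or MySQL, each view or table would then be granted to the matching credential.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL NOT NULL   -- sensitive column
    )
""")
conn.execute("INSERT INTO employees VALUES (1, 'Ada', 95000.0)")

# Partition the schema by access level: the low-privilege credential is
# granted only this view, never the base table with the salary column.
conn.execute("CREATE VIEW employees_public AS SELECT id, name FROM employees")

row = conn.execute("SELECT * FROM employees_public").fetchone()
print(row)  # (1, 'Ada')
```

The design decision happens in the model: which objects exist at each access level, so that a compromised application credential exposes only what that profile was ever meant to see.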
Satisfaction and Excitement From Database Modeling
Beyond the annoyances and headaches of going to production, when we see our scale model – our database design – transformed into a big working machine, all the bad is outweighed by the satisfaction of a job well done. It is exciting to see an application respond by displaying accurate information in the time expected. It is even more exciting to see the astonished faces of the users when the data reveals insights that challenge their notions about the business they thought they knew perfectly well!