- Book and Author
- Importance of Data Model Quality
- Takeoff Checklist
- Merciless Review
- Merciless Humiliation?
- Is it Agile?
Book and Author
Today I’m going to review “A Check List for Doing Data Model Design Reviews” by Kent Graziano. This publication is available as an e-book on Amazon.com.
The book is very short – it will take you less than an hour to read it. But don’t let the small volume mislead you. Graziano’s writing is very concise and full of information. There are no digressions, examples, anecdotes or metaphors. Just raw, austere instruction on what to check when designing a data model and why it is worthwhile to have the whole team review it before moving to the next steps of implementation. Even the title of the book is like the book itself – clear, not very attractive, but concise.
If you are looking for a tutorial on how to design good data models, you will be disappointed. The book is intended for professional data modelers who know their stuff very well. It’s like a takeoff checklist for airplane pilots – it tells you what to check, but not how to do it. And it tells you to check it twice or more, engaging your whole team.
You will see from the first chapter that the book has been written by a database modeling guru who knows his job well and understands the complexity of data models and the consequences of bad design. Indeed, Kent Graziano (aka “The Data Warrior”) is a titled, certified data modeler and data architect with over thirty years of experience, who has been using Oracle and supporting the Oracle community for over twenty-five years now.
Following is a brief overview of the contents of the book and my comments on its major theses, with references to my own data modeling and development experience and intuition.
Importance of Data Model Quality
A data model for an IT project is like the foundation of a building. If the foundation is weak, the building may crumble, regardless of how well the upper stories are constructed. If you design the data model badly, you may lose tens or hundreds of man-hours trying to fix it, and still end up with a solution that is neither elegant, nor maintainable, nor satisfactory.
This knowledge is the main rationale for the book – Graziano shows two ways of reducing the risk of designing a bad data model:
- strict, standardized constraints on the model creation process,
- peer reviews as an optimal way of enforcing these constraints.
Takeoff Checklist
There are two checklists in the book – one for the logical data model and one for the physical data model. They specify what to check to make sure the model does not contain errors.
Both lists are similar and quite high-level. Graziano assumes that the target reader is a practicing database modeler, so no detailed explanations are needed. The list items are just pointers, with no details on how to carry them out. Despite their generality, both lists are valuable and worth using when designing or checking data models.
One thing I miss in the lists is any reference to data. In my experience, so-called “test data” is quite an important aspect of ensuring project quality. This data should cover the most important edge cases for the system, so that developers and testers can test more easily once the database has been loaded with it. The test data should be prepared by the data architect, who knows the data model best, and not by developers, who usually tend to test only the simplest scenarios and omit the edge cases.
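As a rough illustration of what such edge-case test data could look like, here is a minimal sketch using Python's built-in sqlite3 module. The schema, the table names and the rows are all invented for this example and do not come from Graziano's book; the point is only that each seeded row targets one boundary of the model (an optional attribute left empty, a parent with no children, a value sitting exactly on a constraint limit).

```python
import sqlite3

# Hypothetical schema, made up for this illustration (not from the book).
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT                -- optional attribute
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      NUMERIC NOT NULL CHECK (amount >= 0)
);
"""

# Each row is chosen to exercise one edge case of the model.
EDGE_CUSTOMERS = [
    (1, "Ann", "ann@example.com"),  # ordinary row, for comparison
    (2, "Bob", None),               # optional attribute missing
    (3, "X" * 200, None),           # very long name, and no orders at all
]
EDGE_ORDERS = [
    (10, 1, 0),          # amount exactly on the lower bound of the CHECK
    (11, 1, 999999.99),  # unusually large amount
    (12, 2, 15.5),       # ordinary order for a customer without email
]


def build_test_db():
    """Create an in-memory database and load the edge-case rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", EDGE_CUSTOMERS)
    conn.executemany("INSERT INTO customer_order VALUES (?, ?, ?)", EDGE_ORDERS)
    conn.commit()
    return conn


def customers_without_orders(conn):
    """Return ids of customers that have no orders (a typical edge case)."""
    rows = conn.execute(
        "SELECT c.customer_id FROM customer c "
        "LEFT JOIN customer_order o ON o.customer_id = c.customer_id "
        "WHERE o.order_id IS NULL ORDER BY c.customer_id"
    ).fetchall()
    return [row[0] for row in rows]
```

A developer or tester could call `build_test_db()` at the start of a test run and immediately have the awkward cases at hand, instead of only the "happy path" rows they would probably have typed in themselves.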
Merciless Review
The data model development process suggested by Graziano consists of the following main steps:
- developing the logical data model (entity relationship diagram),
- one or more team reviews and correction iterations,
- developing the physical data model,
- one or more team reviews and corrections,
- generating model SQL,
- handing off the model to developers.
The key element here is peer review. The review is performed by the whole team, including developers involved in future implementation.
The result of a review is boolean – either the model passes the review and is accepted, or it is rejected, sometimes in a very spectacular way, such as crumpling the printed model and tossing it at the modeler, or stomping on it to leave a footprint as a sign of rejection. The review is repeated on corrected versions of the model until it is accepted by the whole team.
According to Graziano, the above process is to be followed mercilessly – regardless of how urgent a change to the model is, it must undergo meticulous review. In some cases the size of the review team may be reduced to shorten the time needed for the review.
I think that the “indispensable review” is a very good approach to ensuring data model quality. It significantly reduces the chances of missing something important and helps propagate knowledge about the model through the whole data modeling team.
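The accept-or-reject loop can be sketched in a few lines of Python. This is only my own illustration of the idea, under stated assumptions: the two checks (`missing_primary_key`, `missing_definition`), the dictionary shape of the model and the `add_surrogate_keys` correction step are hypothetical and are not taken from the book's checklists.

```python
from typing import Callable, Dict, List, Tuple

Model = Dict[str, Dict[str, dict]]       # {"entities": {name: {...}}}
Reviewer = Callable[[Model], List[str]]  # returns a list of objections


def missing_primary_key(model: Model) -> List[str]:
    """Hypothetical check: every entity must declare a primary key."""
    return [f"{name}: no primary key"
            for name, entity in model["entities"].items()
            if not entity.get("pk")]


def missing_definition(model: Model) -> List[str]:
    """Hypothetical check: every entity must have a written definition."""
    return [f"{name}: no definition"
            for name, entity in model["entities"].items()
            if not entity.get("definition")]


REVIEWERS: List[Reviewer] = [missing_primary_key, missing_definition]


def collect_objections(model: Model, reviewers: List[Reviewer]) -> List[str]:
    return [objection for reviewer in reviewers for objection in reviewer(model)]


def is_accepted(model: Model, reviewers: List[Reviewer]) -> bool:
    """Boolean outcome: accepted only if nobody on the team objects."""
    return not collect_objections(model, reviewers)


def review_until_accepted(model: Model, reviewers: List[Reviewer],
                          correct: Callable[[Model, List[str]], Model],
                          max_rounds: int = 5) -> Tuple[Model, int]:
    """Repeat review and correction until the whole team accepts the model."""
    for round_no in range(1, max_rounds + 1):
        objections = collect_objections(model, reviewers)
        if not objections:
            return model, round_no
        model = correct(model, objections)
    raise RuntimeError("model still rejected after max_rounds reviews")


def add_surrogate_keys(model: Model, objections: List[str]) -> Model:
    """Stand-in for the modeler's rework: give key-less entities an id column."""
    for name, entity in model["entities"].items():
        if not entity.get("pk"):
            entity["pk"] = [f"{name}_id"]
    return model
```

For example, a draft in which the `order` entity has an empty `pk` list is rejected in the first round, corrected by `add_surrogate_keys`, and accepted in the second. The `max_rounds` guard is my own addition; the book, as I read it, simply repeats the review until the model passes.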
However, it is not cheap. And it is not applicable to every project. Why not? Because you may not have a team of data modelers at your disposal. Because your data modeler may be the most experienced person in your team. And your team may be in a hurry and do the review carelessly and over-tolerantly.
Graziano mentions that changes should be reviewed in small sets if possible, so that the review can be completed in a reasonable time. I agree with that, but it is sometimes impossible to keep changes small. From time to time you may need to refactor significant parts of a model. Predicting all consequences of big changes may be too difficult for some teams.
Finally, consider agile, Scrum-like development, where the customer can see all the costs the team bears during development: will the customer be willing to pay more for such a costly quality assurance process? Or will they accept the risks in exchange for cheaper and quicker delivery?
I think that applicability of the pro-review approach suggested by Graziano depends on project budget, type, criticality of data being processed, deadlines and methodology of project management.
It is not applicable to every project, yet it is worth considering, and it has a high chance of success if applied, to quote the author, mercilessly.
Merciless Humiliation?
The most questionable element of the approach suggested by Graziano is the way a model is rejected during a review. While reading the book, I had the impression that Graziano appreciates spectacular and brutal model rejections during reviews. He seems to find it highly motivating for the modeler to have their model tossed at them as a sign of rejection by the team. He says it is “all in good fun”.
Graziano does not mention the mental and team costs of such harsh methods of cooperation. He just warns that team members should not “cross the line” and that inexperienced modelers could be intimidated by the methods.
I am sure such an attack by the team is shocking and motivating the first time around. But in general – it is frustrating and damaging to the team.
It is not clear to me what motivates people more – the prospect of reward, fear of shame or punishment, or loss aversion.
But I would not like to be a modeler whose work is rejected in the way Graziano describes. I think I would find such an attitude on the part of my team aggressive and discouraging. And I don’t believe a good team can be built from people who are aggressive towards each other.
The only situation in which I would find such aggressive rejection justified is when a modeler refuses to learn and makes the same errors again and again. But even then I would rather see them warned and fired from the job than humiliated.
People usually want to do their work well. They usually don’t want to get humiliated for errors.
Is it Agile?
Graziano claims that to be agile, you need to have a solid and repeatable process of development and that his model development process is agile.
I agree with the first statement. To be agile you have to be predictable and efficient.
However, the process presented by Graziano is not very agile as a whole. It assumes that no model is delivered before the deadline and that the model is delivered to the developers after numerous reviews, fully ready. It assumes there is a team of data architects who work together.
From what I understand about the agile process, I would assume that the model should be delivered by data modelers and data architects as quickly as possible, reviewed by the whole agile team (including developers), and there would be no single moment of handing it off to developers. Instead, the model would be reviewed on the fly and the parts of the model that are accepted by the review would be instantly passed to implementation. This would be agile.
The book is interesting and its small size is an advantage for me. It is a piece of good, understandable technical writing.
The approach to data model quality is sound. I find it perfectly applicable to big, mission-critical applications where the budget can afford it. Such projects should have stable teams of data modelers who know each other well, have a ‘thick skin’ and don’t get offended by aggressive review rejections.
However, in most cases I find it dangerous for teams and too expensive for small and medium-sized projects. In such projects I think the approach can be applied in part, introducing reviews for critical model changes and skipping them for smaller ones.
I also think Graziano may underestimate the psychological aspect of the matter as he probably hardly ever gets his models rejected, due to his very high skills and experience.