People love to communicate. We often joke that any software system always evolves into a messaging system. This article will explain the system requirements and step by step approach to design a data model for a messaging system.
Requirements in Nutshell
The core functionality of a messaging system in an application is to send notifications/messages to a user or a set of users. Our system also allows one to send messages to a user group. User groups can obviously be formed on some parameters like access privileges, geographical location of users, etc.
This system allows receivers to respond to the messages. It also keeps track of who has read the message and who has not.
In addition, the system has a built-in reminder mechanism that allows a sender to create a reminder, and then sends a reminder to all receivers accordingly.
Entities and Relationships
In this data model,
message are the main entities to store users’ and messages’ details.
Columns in the
user table would be user related attributes like
Some self-explanatory columns in the
message table would be
expiry_date. I also add a foreign key column called
creator_id in this table that refers to the
id column of
user table. As its name suggests, it signifies the id of the creator of a message. Since there would always be one creator for a message, I keep this column in the message table only. You might be wondering why there is an
expiry_date column in the table. I have added this column to manage reminders on a message. I will explain more about this column later in this article.
The most important table in this data model is
message_recipient. I would say the whole data model revolves around this table only. One of the main objectives behind creating this table is to hold the mapping between messages and their recipients. Thus the
recipient_id column in this table signifies recipients’ ids, and this column refers to the id column of
When a message is sent to one recipient, one record will be inserted into this table with the recipient’s id in the
Now you may be wondering what the
recipient_group_id column signifies in this table. Here, I should first explain how this model can be extended to a requirement of sending messages to multiple recipients at once.
Sending Message to a Group
I need another table, namely
group, to hold group details. Since there is a many-to-many relationship between the
group tables, i.e. a user can be part of more than one group, I will create another table called
For example, if a group is formed with 25 users, there would be 25 records, one for each user, in the
Let’s get back to the
message_recipient table. I add a reference to the primary key of the
user_group table into the
message_recipient table. I name it
recipient_group_id. This column will hold the value of the user-group for which the message is sent.
Now whenever a message is sent to a group, multiple records will be inserted into the
message_recipient table based on the number of users in the group, and the
recipient_group_id will be logged accordingly against all those records.
Let me illustrate it further with an example. Suppose a message is sent to a group of 10 people. In this case, a total of 10 records, one for each
recipient_group_id of the group, will be inserted into the
Please note that if the message is sent to a user, not a group, then the
recipient_group_id column remains empty. In this case, the direct
user_id will be logged under the
I will add one more column called
is_read into the table to hold a flag against a message-user that signifies whether or not the message is read by the user.
Unique key in
message_recipient table – There should be a composite unique key on the columns
recipient_group_id, to ensure that only one record exists for a unique combination of these columns.
I keep the
is_active column in all tables, except the message and message_recipient tables, in order to enable a ‘soft delete’ of records. Since I have added an
expiry_date column in the message table, an
is_active column is not needed. Moreover, this column is not needed in the
message_recipient table because a message cannot be reverted directly once it is sent. However one can make it inactive by updating the
expiry_date for the message to a date in the past.
Replying to a Message
Now suppose the system allows users to respond to received messages. I extend the same table
message to cater this requirement instead of creating a new table for replies. I will add one column called
parent_message_id to establish a hierarchical relationship between messages. I will insert a new record for reply message, and update the
parent_message_id column for reply messages. This model supports n-level of hierarchical relationship, i.e. reply on reply message can also be tracked through this model.
Dashboard to View ‘Read %’ of Each Message
is_read flag is logged against each message-user record. The value for this flag remains ZERO until the message is read by the user. It will be updated to ONE as soon as the message is read by the user. Based on the column value, one can determine ‘read %’ for a message that is sent to a group.
Let me write a sample SQL to fetch such a report:
SELECT msg.subject, sent_to, msg.create_date, (summ / countt) * 100 AS Read_Per FROM (SELECT msg.subject, grp.name as sent_to, msg.create_date, SUM (is_read) AS summ, COUNT (is_read) AS countt FROM message_recipient msgrec, message msg, user_group ug, group grp WHERE msgrec.message_id = msg.id AND msgrec.recipient_group_id = ug.id AND ug.GROUP_ID = grp.id AND msgrec.recipient_group_id IS NOT NULL GROUP BY msg.subject, grp.name, msg.create_date UNION SELECT msg.subject, u.first_name || ' ' || u.last_name as sent_to, msg.create_date, SUM (is_read) AS summ, COUNT (is_read) AS countt FROM message_recipient msgrec, MESSAGE msg, user u WHERE msgrec.message_id = msg.id AND msgrec.recipient_id = u.id AND msgrec.recipient_group_id IS NULL GROUP BY msg.subject, name, msg.create_date);
|Subject||Sent to||Sent||Read %|
|Project delivery due on Tuesday||Project Delivery Team||9/13/2015 08:15||42%|
|Meet me on Monday||John D||9/10/2015 13:30||100%|
|Sync up dev environment with production||DBA team||9/9/2015 09:11||80%|
|Closing up NCRs of audit||NSS-team||9/9/2015 17:50||45%|
For a reminding functionality, I will add the following columns in the message table:
Is_reminder– This column flags whether or not a reminder is required for the message.
Reminder_frequency_id– This column signifies the frequency of the reminder. Should it be on daily basis or weekly basis?
Next_remind_date– This column holds the date when the next reminder needs to be sent. The reminder will be sent on the
next_remind_datefor the users for whom the ‘is_read’ flag is still ZERO. A new value for this column will be calculated every time a reminder is sent.
Expiry_date– This column is the cut-off date when reminders will no longer be sent to users.
Calculation of the
next_remind_date would be as follows – Suppose one message is sent to users on 9/14, Monday with 10/5 as an expiry date for it. The message is sent with a weekly frequency of reminders. In this case, reminders will be sent to users on 9/21 and 9/28 to respond to them on email, and one last time on 10/5 to urge them to respond in the next 24 hours.
Final Data Model
One of the best usages of this messaging system is to send notifications to users who have been inactive in the system for a long time. These notifications can be sent with a reminding mechanism enabled, and notifications will be sent to users until users respond to the notification. Users will be de-activated on and after the expiry date if no response to the notifications is received from them.
I intended to build a data model for a fully functional messaging system, which can be fit into a variety of systems to send messages/ notifications. Feel free to share your views/ inputs/ comments on the article.