If there’s a way to order groceries online, why not use it? This article examines the data model behind a grocery store’s delivery system.
We still get a special feeling from picking something from the garden and then preparing it right away – but it’s not something we can do often. Today’s fast pace doesn’t allow it. In fact, sometimes it doesn’t even allow us to go to the store to “pick” our groceries. So it makes sense to save ourselves some time and use an app to order what we need. Our order will just show up at our home. Maybe we won’t get that special fresh-picked feeling, but there will be food on our table.
The data model behind such an application is the topic of today’s article.
What Do We Need for a Grocery Delivery Data Model?
The idea of this model is that an application (web, mobile, or both) will allow registered customers to create an order (made up of products from our store). Then this order will be delivered to the customer. We will obviously store customer data and a list of all available products to support this.
Customers can place multiple orders that include different items in different quantities. When a customer’s order is received, store employees should be notified so they can find and pack the needed items. (This may require one or more containers.) Finally, the containers will be delivered, either together or separately.
In the app itself, customers and employees should be able to insert notes and rate the other party after the delivery has been made.
The Data Model
The data model consists of three subject areas:
Items & units
Customers & employees
We’ll present each subject area in the order it is listed.
Section 1: Items and Units
We’ll start with the
Items & units subject area. Although it’s a small part of our model, it contains two very important tables.
unit table stores information about the units we’ll assign to any item in our inventory. For each value in this table, we’ll store two UNIQUE values:
unit_name (e.g. “kilogram”) and
unit_short (e.g. “kg”). Notice that
unit_short is the abbreviation for
The second table in this subject area is
item. It lists all the items we have in inventory. For each item, we’ll store:
item_name– The UNIQUE name we’ll use for that item.
price– The current price of that item.
item_photo– A link to a photo of this item.
description– Additional textual description of the item.
unit_id– References the
unitdictionary and denotes the unit used to measure that item.
Please note that I’ve omitted a few things here. The most important one is a flag that denotes if an inventory item is currently being offered for sale. Why don’t we have this? It would require at least one additional field (the flag) as well as another table (to store historical changes for each item). To keep things simple, I assumed that all the items we have in inventory are also available to sell.
The second important thing I omitted is tracking the warehouse status. My assumption is that we ship everything from one central warehouse and that we’ll always have available items. If we don’t have an item, we’ll simply notify the customer and offer them a similar item in replacement.
Section 2: Customers and Employees
Customers & employees subject area contains all the tables needed to store customer and employee data. We’ll use this information in the central part of our model.
employee table contains a list of all relevant employees (e.g. the grocery packers and the delivery people). For each employee, we’ll store their
last_name and a UNIQUE
employee_code value. Although the id column is also UNIQUE (and this table’s primary key), it’s better to use another, real-world value (e.g. a VAT number) as an employee identifier. Thus we have the
Notice that I haven’t included employee login details, employee roles, and a way of tracking role history. These can be easily added, as described in this article.
Now we’ll add customers to our model. This will take two more tables.
Customers will be segmented geographically, so we’ll need a
city dictionary. For each city where we offer grocery delivery, we’ll store the
city_name and the
postal_code. Together, these form the alternate key of this table.
Customers are definitely the most important part of this model; they are the ones who initiate the whole process. We’ll store a complete list of our customers in the
customer table. For each customer, we’ll store the following:
first_name– The first name of the customer.
last_name– The last name of the customer.
user_name– The username the customer selected when setting up their account.
password– The password the customer chose when setting up their account.
time_inserted– The moment when this record was inserted in the database.
confirmation_code– A code that was generated during the registration code. This code will be used to verify their email address.
time_confirmed– When email confirmation happened.
contact_email– The customer’s email address, which is also used as the confirmation email.
contact_phone– The customer’s phone number.
city_id– The ID of the
citywhere the customer resides.
address– The customer’s home address.
delivery_city_id– The ID of the
citywhere the customer’s order should be delivered.
delivery_address– The preferred delivery address. Note that this can be (but doesn’t have to be) the same as the customer’s home address.
Section 3: Orders
The central and the most important part of this model is the
Orders subject area. Here we’ll find all the tables needed to place an order and to track items until they are delivered to customers.
The whole process starts when a customer places an order. A list of every order ever placed is in the
placed_order table. I’ve intentionally used this name and not “order” because order is an SQL reserved keyword. For each order, we’ll store:
customer_id– The ID of the
customerthat placed this order.
time_placed– The timestamp when this order was placed.
details– All details related to that order, in unstructured textual format.
delivery_city_id– A reference to the
citywhere this order should be delivered.
delivery_address– The address where this order should be delivered.
grade_employee– Grades given by the employee and customer after an order is completed. Until that moment, this attribute contains a NULL value. A customer’s grade indicates how happy they were with our service; an employee’s grade gives us info about what to expect the next time that customer places an order.
During the process of placing an order, a customer will select one or more items. For each item, they’ll define a desired quantity. A list of all items related to each order is stored in the
order_item table. For each record in this table, we’ll store IDs for the related order (
placed_order_id), item (
item_id), the desired quantity, and the
price when this order was placed.
In addition to what they want delivered, customers will also define their desired delivery time. For each order, we’ll create one record in the
delivery table. This will record the
delivery_time_planned and insert additional textual notes. The
placed_order_id attribute will also be defined when this record is inserted. The remaining two attributes will be defined when we assign that delivery to an employee (
employee_id) and when the order was delivered (
While it might look like we’ll only have one delivery per order, that might not always be the case. We may need to make two or more deliveries per order, and that is the main reason why I chose to put delivery data in a new table.
When we start processing an order, employees will pack the items in one or more boxes. Each
box will be defined UNIQUELY by its
box_code and will be assigned to a delivery (
delivery_id). We’ll also store the ID of the employee who prepared that box.
Each box will contain one or more items. Therefore, in the
item_in_box table, we’ll need to store references to the
box table (
box_id) and the
item table (
item_id), as well as the quantity placed in that box. The last attribute,
is_replacement, denotes if an item is a replacement for another item. We can expect that an employee will contact a customer before putting a replacement item in a box. One outcome of that action could be that a customer agrees with the replacement item; another could be the cancelling of the whole order.
The remaining three tables in the model are closely related to statuses and comments.
First, we’ll store all possible statuses in the
status_catalog. Each status is UNIQUELY defined by its
status_name. We can expect statuses like “order created”, “order placed”, “items packed”, “in transit” and “delivered”.
Statuses will be assigned to orders either automatically (after some parts of the process are completed) or, in some cases, manually (e.g. if there is a problem with the order). All available order statuses are stored in the
order_status table. Besides foreign keys from two tables (
placed_order), we’ll store the actual timestamp when this status was assigned (
status_time) and any additional
details in textual format.
The final table in this model is the
notes table. The idea behind this table is to insert all additional comments related to a given order (
placed_order_id). Comments can be inserted by employees or customers. For each record, only one of the
customer_id fields will contain a value; the other will be NULL. We’ll store the moment when this note was inserted into the system (
note_time) and the
What Changes Would You Make to the Grocery Delivery Data Model?
Today we discussed a data model that could support web and mobile grocery delivery apps – both from the customer and from the employee perspective. As already mentioned in this article, there are a lot of ways to improve this model. Feel free to add your suggestions. Tell us what you would add to this model or remove from it. Or maybe you would organize this structure completely differently. Let us know in the comments section!