A Grocery Delivery Data Model

Emil Drkušić, Database designer and developer, financial analyst

by
Emil Drkušić
Database designer and developer, financial analyst

Posted: May 21, 2018

groceries, delivery, data model

If there’s a way to order groceries online, why not use it? This article examines the data model behind a grocery store’s delivery system.

We still get a special feeling from picking something from the garden and then preparing it right away – but it’s not something we can do often. Today’s fast pace doesn’t allow it. In fact, sometimes it doesn’t even allow us to go to the store to “pick” our groceries. So it makes sense to save ourselves some time and use an app to order what we need. Our order will just show up at our home. Maybe we won’t get that special fresh-picked feeling, but there will be food on our table.

The data model behind such an application is the topic of today’s article.

What Do We Need for a Grocery Delivery Data Model?

The idea of this model is that an application (web, mobile, or both) will allow registered customers to create an order (made up of products from our store). Then this order will be delivered to the customer. We will obviously store customer data and a list of all available products to support this.

Customers can place multiple orders that include different items in different quantities. When a customer’s order is received, store employees should be notified so they can find and pack the needed items. (This may require one or more containers.) Finally, the containers will be delivered, either together or separately.

In the app itself, customers and employees should be able to insert notes and rate the other party after the delivery has been made.

The Data Model




The data model consists of three subject areas:

  • Items & units
  • Customers & employees
  • Orders

We’ll present each subject area in the order it is listed.

Section 1: Items and Units

Section 1: Items and Units

We’ll start with the Items & units subject area. Although it’s a small part of our model, it contains two very important tables.

The unit table stores information about the units we’ll assign to any item in our inventory. For each value in this table, we’ll store two UNIQUE values: unit_name (e.g. “kilogram”) and unit_short (e.g. “kg”). Notice that unit_short is the abbreviation for unit_name.

The second table in this subject area is item. It lists all the items we have in inventory. For each item, we’ll store:

  • item_name – The UNIQUE name we’ll use for that item.
  • price – The current price of that item.
  • item_photo – A link to a photo of this item.
  • description – Additional textual description of the item.
  • unit_id – References the unit dictionary and denotes the unit used to measure that item.

Please note that I’ve omitted a few things here. The most important one is a flag that denotes if an inventory item is currently being offered for sale. Why don’t we have this? It would require at least one additional field (the flag) as well as another table (to store historical changes for each item). To keep things simple, I assumed that all the items we have in inventory are also available to sell.

The second important thing I omitted is tracking the warehouse status. My assumption is that we ship everything from one central warehouse and that we’ll always have available items. If we don’t have an item, we’ll simply notify the customer and offer them a similar item in replacement.

Section 2: Customers and Employees

Section 2: Customers and Employees

The Customers & employees subject area contains all the tables needed to store customer and employee data. We’ll use this information in the central part of our model.

The employee table contains a list of all relevant employees (e.g. the grocery packers and the delivery people). For each employee, we’ll store their first_name and last_name and a UNIQUE employee_code value. Although the id column is also UNIQUE (and this table’s primary key), it’s better to use another, real-world value (e.g. a VAT number) as an employee identifier. Thus we have the employee_code field.

Notice that I haven’t included employee login details, employee roles, and a way of tracking role history. These can be easily added, as described in this article.

Now we’ll add customers to our model. This will take two more tables.

Customers will be segmented geographically, so we’ll need a city dictionary. For each city where we offer grocery delivery, we’ll store the city_name and the postal_code. Together, these form the alternate key of this table.

Customers are definitely the most important part of this model; they are the ones who initiate the whole process. We’ll store a complete list of our customers in the customer table. For each customer, we’ll store the following:

  • first_name – The first name of the customer.
  • last_name – The last name of the customer.
  • user_name – The username the customer selected when setting up their account.
  • password – The password the customer chose when setting up their account.
  • time_inserted – The moment when this record was inserted in the database.
  • confirmation_code – A code that was generated during the registration code. This code will be used to verify their email address.
  • time_confirmed – When email confirmation happened.
  • contact_email – The customer’s email address, which is also used as the confirmation email.
  • contact_phone – The customer’s phone number.
  • city_id – The ID of the city where the customer resides.
  • address – The customer’s home address.
  • delivery_city_id – The ID of the city where the customer’s order should be delivered.
  • delivery_address – The preferred delivery address. Note that this can be (but doesn’t have to be) the same as the customer’s home address.

Section 3: Orders

Section 3: Orders

The central and the most important part of this model is the Orders subject area. Here we’ll find all the tables needed to place an order and to track items until they are delivered to customers.

The whole process starts when a customer places an order. A list of every order ever placed is in the placed_order table. I’ve intentionally used this name and not “order” because order is an SQL reserved keyword. For each order, we’ll store:

  • customer_id – The ID of the customer that placed this order.
  • time_placed – The timestamp when this order was placed.
  • details – All details related to that order, in unstructured textual format.
  • delivery_city_id – A reference to the city where this order should be delivered.
  • delivery_address – The address where this order should be delivered.
  • grade_customer & grade_employee – Grades given by the employee and customer after an order is completed. Until that moment, this attribute contains a NULL value. A customer’s grade indicates how happy they were with our service; an employee’s grade gives us info about what to expect the next time that customer places an order.

During the process of placing an order, a customer will select one or more items. For each item, they’ll define a desired quantity. A list of all items related to each order is stored in the order_item table. For each record in this table, we’ll store IDs for the related order (placed_order_id), item (item_id), the desired quantity, and the price when this order was placed.

In addition to what they want delivered, customers will also define their desired delivery time. For each order, we’ll create one record in the delivery table. This will record the delivery_time_planned and insert additional textual notes. The placed_order_id attribute will also be defined when this record is inserted. The remaining two attributes will be defined when we assign that delivery to an employee (employee_id) and when the order was delivered (delivery_time_actual).

While it might look like we’ll only have one delivery per order, that might not always be the case. We may need to make two or more deliveries per order, and that is the main reason why I chose to put delivery data in a new table.

When we start processing an order, employees will pack the items in one or more boxes. Each box will be defined UNIQUELY by its box_code and will be assigned to a delivery (delivery_id). We’ll also store the ID of the employee who prepared that box.

Each box will contain one or more items. Therefore, in the item_in_box table, we’ll need to store references to the box table (box_id) and the item table (item_id), as well as the quantity placed in that box. The last attribute, is_replacement, denotes if an item is a replacement for another item. We can expect that an employee will contact a customer before putting a replacement item in a box. One outcome of that action could be that a customer agrees with the replacement item; another could be the cancelling of the whole order.

The remaining three tables in the model are closely related to statuses and comments.

First, we’ll store all possible statuses in the status_catalog. Each status is UNIQUELY defined by its status_name. We can expect statuses like “order created”, “order placed”, “items packed”, “in transit” and “delivered”.

Statuses will be assigned to orders either automatically (after some parts of the process are completed) or, in some cases, manually (e.g. if there is a problem with the order). All available order statuses are stored in the order_status table. Besides foreign keys from two tables (status_catalog and placed_order), we’ll store the actual timestamp when this status was assigned (status_time) and any additional details in textual format.

The final table in this model is the notes table. The idea behind this table is to insert all additional comments related to a given order (placed_order_id). Comments can be inserted by employees or customers. For each record, only one of the employee_id or customer_id fields will contain a value; the other will be NULL. We’ll store the moment when this note was inserted into the system (note_time) and the note_text.

What Changes Would You Make to the Grocery Delivery Data Model?

Today we discussed a data model that could support web and mobile grocery delivery apps – both from the customer and from the employee perspective. As already mentioned in this article, there are a lot of ways to improve this model. Feel free to add your suggestions. Tell us what you would add to this model or remove from it. Or maybe you would organize this structure completely differently. Let us know in the comments section!

 
 

Try our online database modeler. No registration. No commitments.

 
 
Tags
 
Subscribe to our newsletter

If you find this article useful, join our weekly newsletter to be notified about the latest posts.

 
 
 
New SQL Course! Find out how to identify all the factors of a successful visualization: think like your audience, find the right chart type, and take out the meaning of the data. View course Discover our other courses: