Back to articles list
- 9 minutes read

A Smart Way to Build Skills – An E-Learning Portal Data Model

E-learning (or online learning) offers people a flexible and cost-effective way to pick up new skills. What’s behind these popular portals?

Online learning has become an attractive way for busy people to expand their educational and technical horizons. Being able to learn what you need, at your own pace, and on your own schedule is appealing. Plus, these courses are priced much lower than their traditional counterparts. Since there are no classrooms, no full-time instructors, and a high reusability factor (once a course is produced, it can be accessed by many learners), online courses are very cost-effective.

E-learning portals like Coursera, Udemy, and LearnSQL.com have plenty of courses that anyone, irrespective of their educational and professional background, can use. Let’s take a look at the data model behind this kind of site. We’ll be focusing on the services offered.

How Does an E-Learning Portal Work?

If you’re not already familiar with e-learning, this is how it works:

  • Various courses are published on a portal.
  • Each course consists of one or more chapters. Each chapter can have one or more of the following types of content: text, recorded lectures, explainer videos, written assignments, practical or hands-on projects, etc.
  • A course is created and published by one person (a.k.a. the course instructor). We’ll assume that there will be one and only one instructor per course.
  • There are no restrictions on the maximum number of people that can be enrolled in a course. From now on, we will call the people enrolled in a course “students”.
  • A student can enroll in any number of courses.
  • Many portals offer some of their course content for free. This lets students try out the course before committing to it. Simply enrolling in this type of course is enough to get started, but finishing it often requires paying a modest fee.
  • Once they complete a course, students are asked to submit feedback scores and rate the course.

The Data Model




This data model meets the requirements set out in the previous subheading. It is divided into four different subject areas. I’ll explain each of them.

Subject Area 1: Actors

Considering the business context of an e-learning portal, there are two major types of actors: (1) the instructors who create and publish courses, and (2) the students who enroll in courses. There can be other actors using the app (like admins, bloggers, editors, etc), but we’ll focus on these two key actors.

Subject Area 1: Actors

The Student table holds the details students enter during registration. Students can register using an email address, and they can enroll in any course. (The application can later use some of the fields below to track student progress.) The columns in this table are:

  • id – A unique number assigned to each student during registration.
  • first_name – The student’s first name.
  • last_name – The student’s last name.
  • email – A mandatory column that holds a student’s email address. All communications are carried out using the data in this column.
  • registration_date – The date a student registers with the portal.
  • num_of_courses_enrolled – The current number of courses the student is taking. By default, the column is initialized as “0”.
  • num_of_courses_completed – The number of courses the student has completed so far. By default, the column is initialized as “0”.

Like the student table, the instructor table holds information regarding individual instructors. The only difference is that not everyone is allowed to register as an instructor. There are usually some offline processes in place to recruit instructors. Only authorized individuals can submit course content, and the submitted content goes through a rigorous plagiarism check before it gets published on the portal. The columns in this table are:

  • id – A unique number assigned to each instructor.
  • first_name – The instructor’s first name.
  • last_name – The instructor’s last name.
  • email – The instructor’s email address.
  • registration_date – The date when the instructor registered on the portal.
  • qualification – The highest qualification (i.e. Ph.D, MSCE) held by the instructor.
  • introduction_brief – A brief summary of an instructor’s educational background, work experience, skill sets, hobbies, etc. This will be supplied by the instructor and published on the portal.
  • image – A profile image of an instructor.
  • “num_of_published_courses” – The total number of courses published by the instructor.
  • num_of_enrolled_students – The total number of students enrolled in all courses published by the instructor.
  • average_review_rating – The average review rating (on a scale of 1 (lowest) to 5 (highest)) for an instructor. An average rating is calculated based on the total reviews submitted by students.
  • num_of_reviews – The total number of reviews submitted for the instructor.

You must be wondering why I created separate tables for students and instructors when both could be stored in one table. The reason is the ratio of students to instructors. Online learning portals have gotten so popular that one instructor may have courses being taken by thousands of students.

Also, there is a valid reason for keeping aggregated columns like num_of_enrolled_students and num_of_reviews separate. A successful portal may have millions of students. When this huge volume of data resides in your transaction tables (the enrollment and feedback tables, which we will discuss later), it is always better to store aggregated values within the database itself. But it is not necessary to keep updates in real time. These are just columns to hold aggregated values, and these values may be computed once a day and stored per convenience, i.e. each night. Computing these values every time they are requested is not advisable, considering the millions of records in underlying tables.

Subject Area 2: Courses and Their Details

This subject area describes courses, course chapters, and the various types of content in each chapter.

Subject Area 2: Courses and Their Details

The course table holds basic data about courses. The columns in this table are:

  • id – A unique ID number assigned to each course.
  • course_title – The course title.
  • course_brief – A brief description of the course.
  • instructor_id – The creator of the course; references the instructor table.
  • num_of_chapters – The number of chapters in the course.
  • course_fee – How much the portal charges for the course.
  • language_id – The language the course is delivered in; this references the language table. Many portals publish courses in several languages to encourage as many people as possible to enroll.

Each course is divided into one or more chapters. The course_chapter table stores the following records for each chapter of each course:

  • id – A unique number that identifies each chapter.
  • course_id – The associated course.
  • chapter_title – The chapter’s title.
  • num_of_reading – The number of reading activities included in the chapter.
  • num_of_video – The number of videos included in the chapter.
  • num_of_assignment – The number of assignments associated with the chapter.

The Num_of_reading, num_of_video, and num_of_assignment columns give a fair idea of each chapter’s complexity and how long students will take to complete it.

Each chapter consists of various types of content. The course_chapter_content table holds content details for each chapter. The columns in this table are:

  • id” – A unique number identifying each piece of content.
  • course_chapter_id – The relevant chapter ID.
  • content_type_id – The type of content – e.g. article, video, recorded lecture, podcast, or assignment.
  • is_mandatory – Whether completing the content is mandatory for the course.
  • time_required_in_sec – The number of seconds estimated for students to complete the content. The content’s creator must understand how much time it takes the average student to finish the content. This column plays an important role in computing the estimated time required to complete a chapter or course.
    Note: Time required to complete a chapter = Sum of the times required to complete each piece of related content.
    Time required to complete a course = Sum of the times required to complete each related chapter.
  • is_open_for_free – This column signifies whether the content is available to students using a free enrollment rather than a paid enrollment.

Subject Area 3: Enrollment and Course Progress

The Enrollment and Course Progress subject area is the core of this data model. It describes what is captured when students enroll and how their progress is tracked in the portal.

Subject Area 3: Enrollment and Course Progress

The enrollment table stores enrollment information. A record is inserted into this table whenever a student enrolls for a course (irrespective of free or paid enrollment). The columns for this table are:

  • id – A unique number given to each enrollment.
  • student_id – The ID of the enrolled student.
  • course_id – The ID of the relevant course.
  • enrollment_date – The date when the enrollment took place.
  • is_paid_subscription – Whether the enrollment is free or paid.

The learning_progress table holds details about students’ progress through a course. The columns in this table are:

  • id – The primary key of the table.
  • enrollment_id – References the enrollment table and shows which student is taking which course.
  • course_chapter_content_id – The ID for the chapter content related to this record.
  • begin_timestamp – The timestamp when the student starts the content item.
  • completion_timestamp – The timestamp when the student finishes the content item.
  • status – The current status of the content (i.e. “P” for in progress, “C” for complete).

Note: Records in this table will be inserted only when a student begins on a content item. If there are no records for a particular chapter and enrollment pair in this table, it means that that content has not yet been accessed by the student. Also, multiple records for a content item-enrollment pair are not allowed even if a student goes through the content multiple times. In that case, only the completion_timestamp value will be updated with the most recent time. The begin_timestamp column can only contain the timestamp when the content was originally accessed.

Subject Area 4: Feedback Submission

Capturing and publishing feedback enables instructors to enhance their courses and also helps students choose the most appropriate course for their needs.

Subject Area 4: Feedback Submission

The feedback table stores feedback and reviews submitted by students. It contains the following columns:

  • id” – The primary key for the table.
  • enrollment_id – This column, which refers to the enrollment table, provides student and course details.
  • rating_score – Contains a NUMBER value that represents an individual student’s rating of a course. This value ranges from 1 (worst) to 5 (best).
  • feedback_text – An unstructured text column that stores a student’s detailed written feedback. This is often an optional field, so I have made it nullable.
  • submission_date – The date when feedback is submitted.

Why did we connect the feedback table with the enrollment table? By avoiding a direct connection with the course table (or for that matter, the instructor table) we minimize fake feedback submissions. This way, only students can submit feedback, and only after they complete the course.

What Would You Add to This Data Model?

As users of LearnSQL.com and other online learning portals know, we are never really done with a subject. There’s always something else to explore. With that in mind, what would you add to this data model? What would you change?

What do you think about adapting this model to a similar but different purpose? In the past, we published a series on managing online quizzes. How would you modify this model to capture assignment results?

go to top