
In the database world, there are some things that are universally agreed upon. Increased RAM is largely beneficial to DMBS systems. Spreading out data and log files on RAID improves performance.
Naming conventions are not one of those things.
This is a surprisingly polarizing topic, with the proponents of various methodologies firmly entrenched in their positions. And very vocal and passionate in their defense of the same.
This article will delve into some of the specific conventions and the arguments on both sides, while attempting to present a reasonable conclusion for each point.
The Great Pluralization Debate
At its core, this is a simple topic. For example, what is the correct way to name a table that contains customer information in a relational database schema? Is it Customer
or Customers
?
Arguments on both sides abound.
At first glance, it’s natural to think of a collection of objects in the plural. A group of several individuals or companies would be Customers. Therefore, a table (being a collection of objects) should be named in the plural. An individual row in that table would be a single customer.
The ISO/IEC naming principles, while dated, recommend pluralized table names and singular column names.
Most SQL Server system tables use plural names (sysnotifications, sysoperators), but this is inconsistent. Why sysproxylogin and not sysproxylogins?
In arguments for plural table names, rows in a table are also referenced as ‘instances’ of a whole – similar to items in a collection. Customers defines the whole set; a single customer is an instance of Customers.
Conversely, there are many reasons to use singular object names.
While there may be many items (or customers) in a table, the table itself can be considered a single entity. box of customers isn’t “a boxes of customers”, even if it has a large number of customers inside. Additionally, there could be only one item – or none – inside the table, making “customers” a misnomer.
If you choose to alter the table name based on word variants, inconsistencies can quickly emerge. While many words will be straightforward (Customer becomes Customers, Product becomes Products), other words may not be. In this case, Person could become People or Persons; a singular Moose would look the same as its plural form, Moose. (Although why would you need a table of moose?) A convention such as People.FirstName starts to become confusingly unclear.
If multiple languages are involved, the situation gets even worse. Because pluralization of words can vary in so many ways (customers, mice, moose, children, crises, syllabi, aircraft), non-native speakers have additional challenges. Sticking with singular object names avoids this problem entirely.
The Case Convention Question
There isn’t the same fervor with case conventions as with pluralization, but arguments are made for several different options. They include:
- Pascal Case: The first letter of each concatenated word is capitalized, as in:
CustomerOrder
-
Camel Case: The first letter of the first word is lowercase; all subsequent concatenated words have a capitalized first letter, as in:
customerOrder
Pascal Case is sometimes considered a subtype of Camel Case, but Microsoft generally differentiates between the two.
For words less than three characters, it’s recommended to use only uppercase, as in
UI
orIO
. - Underscore [“C” Case]: Words are separated with underscores, as in either
Customer_Order
, orcustomer_Order
– even more decisions!
Researchers at Johns Hopkins University conducted a study on the efficiency of using underscores in programming object names. They found that use of Camel Case (or Pascal Case) improved typing accuracy and recognition. Underscores were widely used in C programming, but the trend is towards Camel/Pascal Case with recent emphasis on Microsoft and Java-style languages.
As with the other topics, following an established convention is more important than the selection of the convention itself.
An additional consideration here is the case-sensitivity of the database. SQL Server collation determines this sensitivity with ‘CS’ (case-sensitive) or ‘CI’ (case-insensitive) in the collation name. For example:
SQL_Latin1_General_Cp437_CS_AS_KI_WI: Case Sensitive
SQL_Latin1_General_Cp437_CI_AS_KI_WI: Case Insensitive
In case-sensitive collations, Select * from myTable
would fail against the object MyTable
. This may make underscores slightly preferable to prevent confusion, but Intellisense also helps eliminate typing mistakes in most modern programming environments.
Other Naming Convention Considerations
The Singular vs. Plural Debate and the Great Case Question may be where the battle is the fiercest, but there are at least three more areas to keep in mind when considering a naming convention.
Avoid using any SQL Server reserved words as object names. This includes both tables and columns. For example – User, Time, and Date are reserved. Reserved keywords may require additional care to reference (such as using square brackets) depending on the calling application. This applies to spaces as well. Spaces in object names require quotes to reference.
This also ties in with another recommendation – precision. System.CreateDate is far clearer than System.Date. A well-designed model allows the viewer to immediately understand the purpose of the underlying objects. When any identifiers are to be referenced as foreign keys, be liberal in the name – Customer.CustomerID rather than Customer.ID.
Avoid prefixes and suffixes for tables and views, such as tblTable
. Hungarian notation (which was always intended to identify variable usage) slipped into common SQL Server naming conventions, but it is widely derided. Object identifiers should describe what is contained within, not the object itself.
However, prefixes are useful in SQL Server supporting objects, as they describe the functional nature of the object.
The following are commonly accepted prefixes for SQL Server objects:
- IX: Index
- PK: Primary Key
- FK: Foreign Key
- CK: Check Constraint
- DF: Default
- UQ is sometimes used for unique indexes as well.
This model illustrates the points defined above. It requires no explanation as to the nature of the data; singular naming conventions are used, and clear identifiers are in place.
In the end, there are advantages and disadvantages to each side of the convention naming debate. There is one key point, however, that both sides can agree upon: regardless of the decisions made, be consistent with the selected convention.
What SQL naming conventions do you use and why?