
Have you ever wondered what the DBMS (database management system) does when you create a new database, add tables and functions, or define primary keys? It’s not magic. The answer is actually very simple. The DBMS has a “system” database that it uses to store information about all the objects we’ve created in every database on that server instance.
In this article, we’ll take a look at the MySQL INFORMATION_SCHEMA database. This database can simplify many operations. With just a few queries, we can find all the databases on that server. We can also find information about the tables in each database, plus their primary keys, foreign keys, procedures, functions, etc. This makes life a lot easier, especially for DBAs.
Let’s start by looking at a few important tables and queries. Next, we’ll use them to analyze the database schema for an MMO games portal, as presented in this article.
What Is the INFORMATION_SCHEMA?
The INFORMATION_SCHEMA in MySQL is a database that stores metadata about all other databases on that server instance. It contains read-only views that we can query for information.
After we are connected to the INFORMATION_SCHEMA database, we can use the SHOW TABLES
command to get a list of all tables inside the INFORMATION_SCHEMA database. This query can also be used on any other database on the server.
Notice that SHOW TABLES
and SHOW DATABASES
are MySQL commands only, not ANSI-standard SQL commands. The SHOW TABLES
command returns the following list:
- CHARACTER_SETS
- COLLATIONS
- COLLATION_CHARACTER_SET_APPLICABILITY
- COLUMNS
- COLUMN_PRIVILEGES
- ENGINES
- EVENTS
- FILES
- GLOBAL_STATUS
- GLOBAL_VARIABLES
- KEY_COLUMN_USAGE
- PARAMETERS
- PARTITIONS
- PLUGINS
- PROCESSLIST
- PROFILING
- REFERENTIAL_CONSTRAINTS
- ROUTINES
- SCHEMATA
- SCHEMA_PRIVILEGES
- SESSION_STATUS
- SESSION_VARIABLES
- STATISTICS
- TABLES
- TABLESPACES
- TABLE_CONSTRAINTS
- TABLE_PRIVILEGES
- TRIGGERS
- USER_PRIVILEGES
- VIEWS
- INNODB_BUFFER_PAGE
- INNODB_TRX
- INNODB_BUFFER_POOL_STATS
- INNODB_LOCK_WAITS
- INNODB_CMPMEM
- INNODB_CMP
- INNODB_LOCKS
- INNODB_CMPMEM_RESET
- INNODB_CMP_RESET
- INNODB_BUFFER_PAGE_LRU
We quickly notice that some tables have names like TABLES
, COLUMNS
, VIEWS
, and TRIGGERS
. These names imply the sort of data we should find in the said tables.
The Database We’ll Use
We’ll use the database for an MMO game portal, as described in this article.
I won’t go over the model again here. For our purposes, it’s important that we have:
- Tables
- Primary and foreign keys
- Indexes
We’ll use the values stored in the INFORMATION_SCHEMA tables to find all the information we want about our database.
Important MySQL INFORMATION_SCHEMA Queries
As promised, this section will demonstrate a few simple but important queries that we’ll need to extract information from the DBMS.
To list all databases
We’ll use the SHOW DATABASES
command to list all the databases on that MySQL server instance. The result of this query is a list of all database names, including the MySQL “system” databases: MYSQL
, INFORMATION_SCHEMA
and PERFORMANCE_SCHEMA
.
To list all tables with their details
The next thing we want to do is list all the tables in the desired database.
To get that information, we’ll need to query the TABLES
table. Before we do, I’ll quickly describe its most important attributes:
TABLE_SCHEMA
– the actual name of the schema that the table belongs toTABLE_NAME
– the actual name of the tableTABLE_TYPE
– either BASE TABLE or VIEW; temporary tables are not listed.DATA_LENGTH
andINDEX_LENGTH
– refer to the size of the data (estimated) and index in the tableCREATE_TIME
– the actual time when the table was originally createdUPDATE_TIME
– the time of the last change in that table’s dataTABLE_COLLATION
– the table collationTABLE_COMMENT
– actual comments made during modeling
If we don’t use a database name as a selection condition, we’ll get a list of all tables from every database on the server. That is also the case with all the other queries presented in this article.
This first query will simply give us the number of tables in the database.
SELECT `INFORMATION_SCHEMA`.`TABLES`.`TABLE_SCHEMA`, COUNT(*) AS number_of_tables FROM `INFORMATION_SCHEMA`.`TABLES` WHERE `INFORMATION_SCHEMA`.`TABLES`.`TABLE_SCHEMA` = "mmo_database" GROUP BY `INFORMATION_SCHEMA`.`TABLES`.`TABLE_SCHEMA`
The result of the query is:
A slightly modified query returns the actual list of the tables with their details:
SELECT `INFORMATION_SCHEMA`.`TABLES`.`TABLE_NAME`, `INFORMATION_SCHEMA`.`TABLES`.`TABLE_TYPE`, `INFORMATION_SCHEMA`.`TABLES`.`DATA_LENGTH`, `INFORMATION_SCHEMA`.`TABLES`.`INDEX_LENGTH`, `INFORMATION_SCHEMA`.`TABLES`.`CREATE_TIME`, `INFORMATION_SCHEMA`.`TABLES`.`UPDATE_TIME`, `INFORMATION_SCHEMA`.`TABLES`.`TABLE_COLLATION`, `INFORMATION_SCHEMA`.`TABLES`.`TABLE_COMMENT` FROM `INFORMATION_SCHEMA`.`TABLES` WHERE `INFORMATION_SCHEMA`.`TABLES`.`TABLE_SCHEMA` = "mmo_database"
There are 33 rows – one for each table. I selected the most interesting attributes from `INFORMATION_SCHEMA`.`TABLES` but there are many more.
To list all columns in all tables
As you might guess, the details about all columns are stored in the COLUMNS
table. Before we go further, I would like to describe the following columns:
COLUMN_NAME
andCOLUMN_TYPE
– the actual name and column typeCOLUMN_KEY
– this column can contain three values and a NULL value: PRI indicates that column is used as a primary key (or part of a primary key that is composed of multiple columns); UNI stands for UNIQUE; MUL is short for MULTIPLE. The MUL column has an index and can contain the same values many times. All columns in the picture below that have MUL values are used as foreign keys.IS_NULLABLE
– Is the column nullable (YES) or not (NO)?COLUMN_COMMENT
– actual comments written during the modeling process
The query below will list columns and selected details from our database:
SELECT `INFORMATION_SCHEMA`.`TABLES`.`TABLE_NAME`, `INFORMATION_SCHEMA`.`COLUMNS`.`COLUMN_NAME`, `INFORMATION_SCHEMA`.`COLUMNS`.`COLUMN_TYPE`, `INFORMATION_SCHEMA`.`COLUMNS`.`COLUMN_KEY`, `INFORMATION_SCHEMA`.`COLUMNS`.`IS_NULLABLE`, `INFORMATION_SCHEMA`.`COLUMNS`.`COLUMN_COMMENT` FROM `INFORMATION_SCHEMA`.`TABLES`, `INFORMATION_SCHEMA`.`COLUMNS` WHERE `INFORMATION_SCHEMA`.`TABLES`.`TABLE_SCHEMA` = "mmo_database" AND `INFORMATION_SCHEMA`.`TABLES`.`TABLE_SCHEMA` = `INFORMATION_SCHEMA`.`COLUMNS`.`TABLE_SCHEMA` AND `INFORMATION_SCHEMA`.`TABLES`.`TABLE_NAME` = `INFORMATION_SCHEMA`.`COLUMNS`.`TABLE_NAME` ORDER BY `INFORMATION_SCHEMA`.`TABLES`.`TABLE_NAME`, `INFORMATION_SCHEMA`.`COLUMNS`.`ORDINAL_POSITION`
The list is ordered first by the name of the table and then by the ordinal position of each column inside the table. The order should be the same as it is in our model. The snapshot of our result is:
All values are as expected. I would like to point out some possible values in the COLUMN_KEY
column:
PRI
– indicates that column is used as a primary key (or part of a primary key that is composed of multiple columns)UNI
– stands for UNIQUEMUL
– is short of Multiple; The MUL column has an index and can contain the same values multiple times.
To list all keys and find constraints
Let’s stick with the previously-mentioned PRI and MUL values and take a closer look at the details of primary and foreign key constraints.
The KEY_COLUMN_USAGE
table describes the columns used in primary, unique, and foreign keys. The most important attributes in this table are:
COLUMN_NAME
– the name of the column usedCONSTRAINT_NAME
– the name assigned to this constraint during modelingREFERENCED_TABLE_NAME
andREFERENCED_COLUMN_NAME
– references to the actual column (if the constraint type is a foreign key)
The code we need for this query is:
SELECT `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`.`TABLE_NAME`, `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`.`COLUMN_NAME`, `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`.`CONSTRAINT_NAME`, `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`.`REFERENCED_TABLE_NAME`, `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`.`REFERENCED_COLUMN_NAME` FROM `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE` WHERE `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`.`TABLE_SCHEMA` = "mmo_database" AND `INFORMATION_SCHEMA`.`KEY_COLUMN_USAGE`.`TABLE_NAME` = "action_allowed"
The result contains a primary key (id
), an alternate key composed of two attributes (membership_type_id
and membership_actions_id
) and two foreign keys (membership_type_id
and membership_actions_id
) with their references.
We could use the KEY_COLUMN_USAGE
table to check if every table in our database has a primary key. If we used the same naming convention in all our models, we could use predefined queries to perform checks like:
- Does every table in the database have a primary key called
id
? - Is every
_id column a foreign key? Does the table it references exist? - If a table consists entirely of foreign keys and a primary key, is an alternate key set over all the foreign keys?
For a less-detailed overview of the constraints in various tables, we could use the TABLE_CONSTRAINTS
table. In this table:
- CONSTRAINT_NAME – is the actual name of the constraint; and
- CONSTRAINT_TYPE – can be PRIMARY KEY, UNIQUE or FOREIGN_KEY.
This query looks like:
SELECT `INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS`.`TABLE_NAME`, `INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS`.`CONSTRAINT_NAME`, `INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS`.`CONSTRAINT_TYPE` FROM `INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS` WHERE `INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS`.`TABLE_SCHEMA` = "mmo_database" AND `INFORMATION_SCHEMA`.`TABLE_CONSTRAINTS`.`TABLE_NAME` = "action_allowed"
The result is shown below.
To list all indexes
Indexes are hidden in the “STATISTICS” table.
Let’s take a closer look at its columns:
COLUMN_NAME
– the actual name of the column that is part of that indexINDEX_NAME
– the index nameINDEX_TYPE
– the actual type of the index. Possible values are: BTREE, RTREE, HASH and FULLTEXT.
Again, the code is:
SELECT `INFORMATION_SCHEMA`.`STATISTICS`.`TABLE_NAME`, `INFORMATION_SCHEMA`.`STATISTICS`.`COLUMN_NAME`, `INFORMATION_SCHEMA`.`STATISTICS`.`INDEX_NAME`, `INFORMATION_SCHEMA`.`STATISTICS`.`INDEX_TYPE` FROM `INFORMATION_SCHEMA`.`STATISTICS` WHERE `INFORMATION_SCHEMA`.`STATISTICS`.`TABLE_SCHEMA` = "mmo_database" AND `INFORMATION_SCHEMA`.`STATISTICS`.`TABLE_NAME` = "action_allowed"
This query will return a list of all the indexes created over the action_allowed
table.
To See Procedures and Functions
The ROUTINES
table contains information about stored procedures and functions (with the exception of user-defined functions). Its most important attributes are:
ROUTINE_NAME
– the actual name of the procedure or functionROUTINE_TYPE
– either PROCEDURE or FUNCTIONROUTINE_DEFINITION
– contains the entire routine code
The following query will list all procedures and functions in our database:
SELECT `INFORMATION_SCHEMA`.`ROUTINES`.`ROUTINE_NAME`, `INFORMATION_SCHEMA`.`ROUTINES`.`ROUTINE_TYPE`, `INFORMATION_SCHEMA`.`ROUTINES`.`ROUTINE_DEFINITION` FROM `INFORMATION_SCHEMA`.`ROUTINES` WHERE `INFORMATION_SCHEMA`.`ROUTINES`.`ROUTINE_SCHEMA` = "mmo_database"
Ideas on How to Use INFORMATION_SCHEMA
As mentioned before, INFORMATION_SCHEMA is very useful for DBAs. You can utilize it to validate rules inside the database, for one thing. But this isn’t the limit of its usability: developers could use it in scripts when they make changes to an existing database. For example, they could use INFORMATION_SCHEMA as part of a script that checks if a column exists before adding it to a table. Finally, INFORMATION_SCHEMA can be used in documenting a database.
The INFORMATION_SCHEMA database or its equivalent is present in almost every DBMS today. At first, one might think that its only use is to show the current status of databases on the server. And it can certainly do that. But if you employ the same naming convention across databases, you can use INFORMATION_SCHEMA to implement checks. This is very useful, and not only for DBAs.
There is much more to INFORMATION_SCHEMA. The best way to discover its other options is simply to play with it. Please share how you’ve used it and what you’ve discovered about it in the comments section below.