You are the owner of an online store, located in Poland. The majority of your customers are from Poland and they speak Polish. But you want to sell your products abroad too and your international customers mainly speak English. So you want your online store to be available in both Polish and English. You also expect that your products will sell well in France, so you anticipate that you’ll have to prepare a French version of the store as well (and maybe Spanish too, because why not?).
You want your users to be able to switch from the Polish version
to the English version and back.
And obviously you want the product details to switch from Polish to English.
How Do You Make a Multi-Language Web Application?
There are two types of text in your application. One is static data: button labels, table headers, graphics (which often contain text). The other is the user-defined data, such as product name, product price, product description and so on. The data are normally taken from the database.
The static data are what you would want to write as a string literal in your output. The user-defined data are data which you take from your database.
I won’t talk about static data today. Any reasonable web framework will handle internationalization of the static data. Consult the documentation of your web application framework for details. Look for keywords like “internationalization,” “i18n,” “localization,” or “translations.”
Today I will talk about what database structure we usually use here at e-point to handle multi-language data. In the database for your store, you probably have the
product table which stores info about all products available in the store.
product table has columns like
price. When you translate the product info into other languages, you only have to translate some columns. Here you would translate only the
description, but the
price does not change when you switch languages.
When we add support for multiple languages we add a new table called
language_version, which stores all language versions available in the store. We usually add a column called
code and one called
is_default (with an appropriate constraint: only one version can be the default).
Next, we split the
product table into two tables: table
product and table
product_lv. For every product, there will be one row in the
product table and multiple rows in the
product_lv table; one row for each language version. The table
product_lv only contains columns which have to be translated:
description. The columns that are language-independent (like
price, because you sell for the same price no matter if your customer speaks English or Polish) stay in the table
We do the same for every table which contains translated data. The translated data go to the
table_lv table, the language-independent data stay in the main table.
Pros and Cons
An obvious disadvantage is that all create, retrieve, update, or delete (CRUD) operations get a little bit more complicated. You always have to join with an additional language version table to get the right description.
The advantage of this design is you don’t have to change your database schema when you add a new language version. I’m not saying that adding a new language version is easy. After all, you have to translate ALL product descriptions. This is an organizational challenge but from the database point of view it’s fairly easy: plenty of inserts.
How do you design your multi-language databases?