How we scaled up web localization

March 29, 2022



If your website is only available in English, you may be losing out on a lot of user traffic and revenue. A recent study has shown that only around 25% of internet users speak English, which means that localizing our websites is a big priority at Nord Security. We currently support more than 20 locales and have country-specific content, which helps us reach out to customers all over the world. Read on to find out how we do it.


Single source of translations

First, let's define translations: a translation is a message that is localized for one or multiple languages. It has these attributes:

  • domain - a namespace for translations. Usually, each web app gets its separate domain. Example: nordvpn-web.

  • key - this is an English word or sentence that serves as a unique identifier for translations. Example: "How to set up a VPN on Mac".

  • locale - this is a unique locale code consisting of a language code, optionally followed by a hyphen separator and a country code. It can have two formats:

    • A lowercase two-letter ISO 639-1 language code. Examples: en, de, fr.

    • A lowercase two-letter ISO 639-1 language code, a hyphen, and a lowercase two-letter ISO 3166-1 country code. Examples: zh-tw, pt-br.

  • message - the actual translation, a localized message. Example: "Cómo configurar una VPN en Mac".
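
As a rough illustration, the four attributes above could be modeled like this. The `Translation` interface, `isValidLocale`, and `translationId` are hypothetical names invented for this sketch, not part of our codebase, and the pattern only checks the shape of a locale code, not whether the codes are real ISO entries.

```typescript
// Hypothetical shape of one translation record, based on the four attributes above.
interface Translation {
  domain: string; // namespace, e.g. "nordvpn-web"
  key: string; // English source text that identifies the translation
  locale: string; // "es" or "pt-br"
  message: string; // the localized text
}

// Two lowercase letters, optionally followed by a hyphen and two more lowercase letters.
const LOCALE_PATTERN = /^[a-z]{2}(-[a-z]{2})?$/;

// Checks only the format of the code (assumption: no lookup against ISO lists).
function isValidLocale(locale: string): boolean {
  return LOCALE_PATTERN.test(locale);
}

// Builds an identifier that is unique per domain, locale, and message key.
function translationId(t: Translation): string {
  return [t.domain, t.locale, t.key].join("|");
}

const exampleTranslation: Translation = {
  domain: "nordvpn-web",
  key: "How to set up a VPN on Mac",
  locale: "es",
  message: "Cómo configurar una VPN en Mac",
};
```

With this shape, `isValidLocale("pt-br")` passes while `isValidLocale("pt-BR")` does not, which matches the lowercase convention described above.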

Initially, all of our translations were stored separately in each web project's files along with the code in our version control system. But that quickly became inconvenient, as we needed to add new translations and update old ones constantly. We needed a way to store translations in a centralized place that would allow us to update them across all our websites in real-time.

This is the diagram of the solution we came up with:

Localization blog 2

We'll briefly cover each part:

  • Translations microservice: holds all of our websites' translations and provides a REST API to run CRUD operations. Caches responses in Redis;

  • Admin panel: allows content managers to manage all the translations. Has useful functions like batch translation CSV file import and export;

  • Website (nordvpn.com): requests translation messages from the Translations microservice and caches them in Redis.

Refreshing translations in real time is done by issuing a command via the Admin panel. When a content manager adds new translations to a domain and wants to see them appear on our websites, they click the refresh button for that domain. This sends a command through our RabbitMQ broker to all the listening website consumers, telling them to refresh the translations for that domain. Each consumer downloads the new translations and caches them in Redis for fast lookup.
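
The refresh flow can be sketched roughly as follows. This is a simplified illustration under several assumptions: an in-memory `Map` stands in for Redis, a direct call to `handleRefresh` stands in for a RabbitMQ consumer receiving the command, and the loader is synchronous to keep the example short (a real HTTP client would be asynchronous). `TranslationCache`, `DomainLoader`, and `backend` are hypothetical names.

```typescript
// Messages for one domain: locale -> (key -> message).
type DomainMessages = Map<string, Map<string, string>>;

// Stand-in for a request to the Translations microservice (hypothetical signature).
type DomainLoader = (domain: string) => DomainMessages;

class TranslationCache {
  private domains = new Map<string, DomainMessages>();
  private load: DomainLoader;

  constructor(load: DomainLoader) {
    this.load = load;
  }

  // Called when a refresh command for a domain arrives from the broker.
  handleRefresh(domain: string): void {
    this.domains.set(domain, this.load(domain));
  }

  // Returns the cached message, or undefined when nothing is cached for it.
  lookup(domain: string, locale: string, key: string): string | undefined {
    return this.domains.get(domain)?.get(locale)?.get(key);
  }
}

// Fake backend that plays the role of the microservice's data.
const backend = new Map<string, DomainMessages>();
backend.set("nordvpn-web", new Map([["es", new Map([["Hello", "Hola"]])]]));

const siteCache = new TranslationCache((domain) => backend.get(domain) ?? new Map());
siteCache.handleRefresh("nordvpn-web");
```

The point of the design is that websites never poll for changes: the cache keeps serving what it has until a refresh command for that domain arrives.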

At the time of writing, we have 5+ localized websites in 26 locales and over 500,000 translated messages. The system scales well and has stood the test of time as new websites are added.

JSX content? No problem

```
<Page>
  <PricingOffer
    title={{
      text: <T id="Choose your plan" />,
    }}
    subtitle={<T id="Encrypt your internet connection to protect your data and privacy." />}
  />
</Page>
```

Our web content team writes JSX, which allows us to leverage the power of React components. But browsers can't render JSX directly, so we built a system that lets the team write JSX in the content editor and later translates it and converts it into static HTML.
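
This article doesn't show how the `T` component resolves its `id`, so the following is only one plausible approach, written as a plain function rather than a React component. `translateOrKey` and `sampleCatalog` are hypothetical names, and the catalog holds made-up sample data.

```typescript
// locale -> (message key -> translated message); sample data for illustration only.
const sampleCatalog: Record<string, Record<string, string>> = {
  es: { "Choose your plan": "Elige tu plan" },
};

// Resolves a message key for a locale. If the locale or the key is missing,
// the English key itself is returned (an assumption of this sketch).
function translateOrKey(id: string, locale: string): string {
  return sampleCatalog[locale]?.[id] ?? id;
}
```

Falling back to the key means an untranslated message still renders readable English text instead of an empty string or an opaque alias.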

Localization blog 1

The system consists of these parts:

  • Content store + editor;

  • Static Content Generator;

  • Translations microservice;

  • Web app.

When a content manager updates a blog post or publishes a page in the editor, the content is published as a message to RabbitMQ:

```
{
  "locales": ["en", "fr", "es"],
  "content": "<T message='Hello'>"
}
```

The message is then consumed by the Static Content Generator microservice, which converts the JSX content to static HTML and fills in translations from the Translations microservice. Once this is done, the post content (now HTML) is sent back to the web app through RabbitMQ, one message per locale:

```
{
  "locale": "en",
  "content": "<div>Hello!</div>"
}

{
  "locale": "fr",
  "content": "<div>Bonjour!</div>"
}

{
  "locale": "es",
  "content": "<div>Hola!</div>"
}
```
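
The fan-out from one incoming message to one message per locale can be sketched like this. It is a deliberately naive illustration: it swaps `<T message='...'>` tags with a regular expression, whereas a real generator would render the JSX, and it does no HTML escaping. `fanOut`, `renderForLocale`, and `generatorCatalog` are hypothetical names with sample data.

```typescript
interface ContentMessage {
  locales: string[];
  content: string;
}

interface LocalizedMessage {
  locale: string;
  content: string;
}

// Sample translations: locale -> (key -> message). English has no entries here,
// so it falls back to the key.
const generatorCatalog: Record<string, Record<string, string>> = {
  fr: { Hello: "Bonjour" },
  es: { Hello: "Hola" },
};

// Matches <T message='...'> or <T message="...">, with an optional self-closing slash.
const T_TAG = /<T\s+message=(['"])(.*?)\1\s*\/?>/g;

// Replaces every T tag with its translation for the locale, or with the key itself.
function renderForLocale(content: string, locale: string): string {
  return content.replace(
    T_TAG,
    (_tag: string, _quote: string, key: string) => generatorCatalog[locale]?.[key] ?? key,
  );
}

// Produces one outgoing message per requested locale.
function fanOut(input: ContentMessage): LocalizedMessage[] {
  return input.locales.map((locale) => ({
    locale,
    content: renderForLocale(input.content, locale),
  }));
}
```

Calling `fanOut` with three locales returns three independent messages, so each one can be stored and served without touching the others.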

The content is saved in the database separately for each language that the page supports. When a user visits our site, we show them the translated static HTML.

That's a simplified view of how we serve localized content. There are also caching mechanisms in place to reduce database load. I won't go too much into detail about them in this article, but if you'd like to know more about how we cache HTML content, you can read about it here.

Tips for website localization

Here is some advice that you can hopefully use when localizing your web application:

  • Localization is more than translation: you need to not only translate but adapt and optimize your website by language and region.

  • Regional language differences: the Portuguese language in Brazil is not the same as it is in Portugal, so prepare for that in advance. This is just one example of many languages that differ.

  • Use this URL path localization format: https://your-website.com/{locale-code}/your-page. The locale code should be lowercase, with its parts separated by a hyphen (e.g., en, pt-br, zh-tw). This way, it's easy to add new languages as well as manage relative links on the website.

  • Message keys: use actual text as the translation message key, not aliases ("Hello world", not "hello_world_text"). This makes translation keys searchable. It also reduces complexity, as you don't have to maintain the aliases too.

  • Use Unicode (UTF-8) when storing your content and translations. It supports all the languages you'll need. Trust me, you will save yourself a lot of pain in the future.

  • Translation libraries: there are readily available libraries that can help you not just translate words but also convert them to plural form when necessary. For instance, PHP has a powerful Translation component (part of Symfony) that supports the ICU MessageFormat syntax. I recommend using one of these, if possible, instead of inventing your own way of translating messages.

  • Use an API to store and get translations. This will allow you to reuse translations between projects, import/export translations, and automate much of the translation process.
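
To make the URL path tip more concrete, here is a small sketch of building and parsing localized paths. The supported locale list is a sample subset, and falling back to a default locale when the first path segment is not a known locale is an assumption of this example. `localizedPath` and `splitLocale` are hypothetical helper names.

```typescript
const SUPPORTED_LOCALES = ["en", "pt-br", "zh-tw"]; // sample subset
const DEFAULT_LOCALE = "en"; // assumption: used when the path has no locale prefix

// Builds "/{locale-code}/your-page" with a lowercase locale code.
function localizedPath(locale: string, pagePath: string): string {
  const trimmed = pagePath.replace(/^\/+/, "");
  return `/${locale.toLowerCase()}/${trimmed}`;
}

// Splits a pathname into its locale prefix and the remaining page path.
function splitLocale(pathname: string): { locale: string; pagePath: string } {
  const [first = "", ...rest] = pathname.replace(/^\/+/, "").split("/");
  if (SUPPORTED_LOCALES.indexOf(first) !== -1) {
    return { locale: first, pagePath: "/" + rest.join("/") };
  }
  return { locale: DEFAULT_LOCALE, pagePath: "/" + [first, ...rest].join("/") };
}
```

Because the locale is always the first path segment, relative links such as `pricing` keep working under every locale prefix without extra rewriting.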

