How to check the language in which users write?

Whether your site accepts multiple languages or just a single one, you need to check the language for each item that gets submitted.

Implio has a built-in, AI-powered language detection feature that automates this check.

Automatic language detection

Implio automatically detects the language in which the title and body of an item are written. The result is exposed as a BLANG variable containing an ISO 639-1 code. You are then free to create automation rules that make use of its value.

You can also specify the language that you expect the title and body to be written in using the content.languageExpected API input field (see API documentation). If specified, this value will be taken into consideration when the language is automatically detected, making the language prediction more reliable.
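As a rough illustration, the expected language could be set while assembling the item that is sent to the API. The sketch below only builds a JSON body in Python and does not call any endpoint. Apart from content.languageExpected, which is named above, everything here is an assumption made for illustration: the helper build_item_payload, the id field, and the title and body field names. Check the API documentation for the actual schema.

```python
import json


def build_item_payload(item_id, title, body, language_expected=None):
    """Assemble a JSON-serializable item (hypothetical schema, see lead-in)."""
    content = {"title": title, "body": body}
    if language_expected is not None:
        # ISO 639-1 codes are two ASCII letters, e.g. "en" or "fr".
        code = language_expected.strip().lower()
        if len(code) != 2 or not (code.isascii() and code.isalpha()):
            raise ValueError(f"not an ISO 639-1 code: {language_expected!r}")
        content["languageExpected"] = code
    # When no expected language is given, the field is simply left out.
    return {"id": item_id, "content": content}


payload = build_item_payload("item-1", "Vélo à vendre", "Très bon état", "fr")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Leaving the field out when no expectation exists mirrors the documented behaviour, where the expected language is then undefined on the rule side.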

The expected language is also exposed as a BLANG variable, so you can compare it against the detected language:

Name (simple mode) | Variable name (advanced mode) | Description | Type | Example
Language detected | $text.languageDetected | Contains the ISO 639-1 code of the language in which the title and the body were detected to be written, or the string "unknown" if no language could be detected. | String | "fr"
Language expected | $text.languageExpected | Contains the ISO 639-1 code of the language in which the title and the body are expected to be written, or <undefined> if no value was specified in the content.languageExpected API input field. | String | "en"

Making use of the detected language

The following automation rule ensures that the detected language matches the expected one:

$text.languageDetected EQUALS $text.languageExpected

This rule handles cases where no language was predicted:

$text.languageDetected EQUALS "unknown"
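To make the logic of these two rules concrete, here is a minimal Python sketch that mirrors them outside of Implio. It is not BLANG and not part of the product: the function check_language and its return labels are hypothetical, and None stands in for the <undefined> value of $text.languageExpected.

```python
from typing import Optional

UNKNOWN = "unknown"  # value of $text.languageDetected when detection fails


def check_language(detected: str, expected: Optional[str]) -> str:
    """Classify an item as 'unknown', 'no_expectation', 'match' or 'mismatch'."""
    if detected == UNKNOWN:
        # Mirrors: $text.languageDetected EQUALS "unknown"
        return "unknown"
    if expected is None:
        # No content.languageExpected was supplied, so nothing to compare.
        return "no_expectation"
    # Mirrors: $text.languageDetected EQUALS $text.languageExpected
    return "match" if detected == expected else "mismatch"


print(check_language("fr", "en"))
```

Testing for "unknown" first keeps an undetected language from being reported as a plain mismatch.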

This rule ensures that the language is either English or French:

$text.languageDetected CONTAINS ("en", "fr")

If you have a longer list of supported languages, you can add them to a list:

$text.languageDetected CONTAINS @supportedLanguages
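The list-based check amounts to a simple membership test. The Python sketch below illustrates that idea only; the contents of SUPPORTED_LANGUAGES and the helper is_supported are assumptions, not the actual contents of any Implio list.

```python
# Hypothetical list contents, playing the role of @supportedLanguages.
SUPPORTED_LANGUAGES = frozenset({"en", "fr", "de", "es"})


def is_supported(detected: str, supported=SUPPORTED_LANGUAGES) -> bool:
    """Return True when the detected ISO 639-1 code is in the supported set."""
    return detected in supported


for code in ("fr", "it", "unknown"):
    print(code, is_supported(code))
```

Because "unknown" is never a member of the set, items with no detected language fail this check as well and may need a separate rule.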

You may also combine this variable with other ones to create more complex conditions.

Supported languages

Automatic language detection currently supports the following 123 languages:

Language ISO 639-1 code
Afrikaans af
Albanian sq
Amharic am
Arabic ar
Aragonese an
Armenian hy
Assamese as
Avaric av
Azerbaijani az
Bashkir ba
Basque eu
Belarusian be
Bengali bn
Bihari languages bh
Bosnian bs
Breton br
Bulgarian bg
Burmese my
Catalan ca
Central Khmer km
Chechen ce
Chinese zh
Chuvash cv
Cornish kw
Corsican co
Croatian hr
Czech cs
Danish da
Divehi dv
Dutch nl
English en
Esperanto eo
Estonian et
Finnish fi
French fr
Gaelic gd
Galician gl
Georgian ka
German de
Greek el
Guarani gn
Gujarati gu
Haitian ht
Hebrew he
Hindi hi
Hungarian hu
Icelandic is
Ido io
Indonesian id
Interlingua ia
Interlingue ie
Irish ga
Italian it
Japanese ja
Javanese jv
Kannada kn
Kazakh kk
Kirghiz ky
Komi kv
Korean ko
Kurdish ku
Lao lo
Latin la
Latvian lv
Limburgan li
Lithuanian lt
Luxembourgish lb
Macedonian mk
Malagasy mg
Malay ms
Malayalam ml
Maltese mt
Manx gv
Marathi mr
Mongolian mn
Nepali ne
Norwegian no
Norwegian Nynorsk nn
Occitan oc
Oriya or
Ossetian os
Panjabi pa
Pashto ps
Persian fa
Polish pl
Portuguese pt
Quechua qu
Romanian ro
Romansh rm
Russian ru
Sanskrit sa
Sardinian sc
Serbian sr
Serbo-Croatian sh
Sindhi sd
Sinhala si
Slovak sk
Slovenian sl
Somali so
Spanish es
Sundanese su
Swahili sw
Swedish sv
Tagalog tl
Tajik tg
Tamil ta
Tatar tt
Telugu te
Thai th
Tibetan bo
Turkish tr
Turkmen tk
Uighur ug
Ukrainian uk
Urdu ur
Uzbek uz
Vietnamese vi
Volapük vo
Walloon wa
Welsh cy
Western Frisian fy
Yiddish yi
Yoruba yo

Known limitations

Automatic language detection uses state-of-the-art machine learning for accurate predictions. As with any predictive model, however, it may sometimes mistake one language for another, especially when the languages are very similar (e.g. Russian and Ukrainian).
One way to avoid some of those mistakes is to specify the expected language (see above).

Also, a language may not always be predicted, for instance when the text is very short. You can choose how to handle those cases in your automation rules by testing $text.languageDetected for the "unknown" string (see example above).

