How to detect toxic language using Text Vision

Whether you are running a dating site, classifieds site or some other type of marketplace, toxic language is likely something that you encounter and need to deal with to keep your site clean and protect your users.

This article explains how to use Implio's built-in Text Vision filters to detect and block inappropriate content before it reaches your site.

Target audience

Toxic language filters can be useful for the following types of marketplaces and content:

Type of site	Classifieds site, dating site, news site, sharing economy site, and other types of marketplaces
Type of content	Classified ad, profile description, comment, user review, one-to-one message

Supported fields and languages

Toxic language filters operate on the following API input fields:

content.title
content.body

The following languages are currently supported:

Language	ISO 639-1 code
English	en
French	fr
Spanish	es

Not seeing the language you are looking for? Reach out to our support team to learn more about upcoming languages!

How to use toxic language filters

Before you start

Implio's built-in Text Vision filters are language-specific and therefore rely on automatic language detection.
For best results, set the content.languageExpected API input field, which makes language detection more reliable. See How to check the language in which users write for more information.
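
As a rough illustration, the sketch below assembles the three input fields mentioned so far and checks whether the expected language is one the toxic language filters cover. Only the field names come from this article. The nested dictionary layout, the use of an ISO 639-1 code as the value of content.languageExpected, and the helper names are assumptions made for the example, so check the API documentation for the actual request format.

```python
# Languages listed in this article as supported by the toxic language filters.
SUPPORTED_LANGUAGES = {"en", "fr", "es"}


def toxic_filters_supported(language_code: str) -> bool:
    """Return True if the toxic language filters cover this language."""
    return language_code.lower() in SUPPORTED_LANGUAGES


def build_content(title: str, body: str, language_expected: str) -> dict:
    """Assemble the input fields the filters operate on.

    Hypothetical helper: the nesting under "content" is inferred from the
    dotted field names and is an assumption, not the documented payload.
    """
    return {
        "content": {
            "title": title,
            "body": body,
            # Assumed to be an ISO 639-1 code; the article does not say.
            "languageExpected": language_expected,
        }
    }


payload = build_content("Vélo à vendre", "Très bon état.", "fr")
print(toxic_filters_supported(payload["content"]["languageExpected"]))  # True
```

For a language outside the supported list, the filters still run but every count is 0, so a check like this can help explain why a rule never triggers.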

BLANG variables

Toxic language filters are exposed as several BLANG variables, each corresponding to a different kind of toxic language.
Each variable contains the number of terms that were found in the text:

Variable	Description	Type	Possible values
$text.blasphemyCount	Number of blasphemy terms detected in the text	Integer	Number of terms detected, 0 if no term matched or if the language isn't supported
$text.sexualTermCount	Number of sexual terms detected in the text	Integer	Number of terms detected, 0 if no term matched or if the language isn't supported
$text.badWordCount	Number of bad words detected in the text	Integer	Number of terms detected, 0 if no term matched or if the language isn't supported
$text.violenceTermCount	Number of terms related to violence detected in the text	Integer	Number of terms detected, 0 if no term matched or if the language isn't supported
$text.extremismTermCount	Number of terms related to extremism detected in the text	Integer	Number of terms detected, 0 if no term matched or if the language isn't supported
$text.racismTermCount	Number of terms related to racism detected in the text	Integer	Number of terms detected, 0 if no term matched or if the language isn't supported

Note that each term or expression is counted by at most one of the above variables. In other words, there is no overlap between the different filters.

Additionally, the following variable contains the sum of all the variables listed above:

Variable	Description	Type	Possible values
$text.toxicTermCount	Total number of toxic terms detected in the text	Integer	Number of terms detected, 0 if no term matched or if the language isn't supported
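
To make the relationship between the variables concrete, here is a minimal Python model of it. It is not Implio code: the class and attribute names are hypothetical and simply mirror the BLANG variable names. It shows only that the total is the plain sum of the six per-category counts, which holds because no term is counted twice.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class ToxicCounts:
    """Illustrative stand-in for the six per-category BLANG counters."""

    blasphemy_count: int = 0
    sexual_term_count: int = 0
    bad_word_count: int = 0
    violence_term_count: int = 0
    extremism_term_count: int = 0
    racism_term_count: int = 0

    @property
    def toxic_term_count(self) -> int:
        # Mirrors $text.toxicTermCount: the sum of all category counts.
        return sum(astuple(self))


counts = ToxicCounts(bad_word_count=2, racism_term_count=1)
print(counts.toxic_term_count)  # 3
```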

Filtering toxic language using rules

This BLANG condition will pick up any occurrence of toxic language in the text:

$text.toxicTermCount>0

which is strictly equivalent to:

$text.blasphemyCount>0 OR $text.badWordCount>0 OR $text.sexualTermCount>0 OR $text.violenceTermCount>0 OR $text.extremismTermCount>0 OR $text.racismTermCount>0

Depending on the type of toxic language you wish to filter out, you may remove some of the variables from the condition (blasphemy, for instance, may be considered acceptable on your site). You may also split the condition into several conditions, each with a different action.
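
The effect of dropping a variable from the condition can be simulated outside Implio. The sketch below is only a model of the OR condition, written in Python rather than BLANG: the dictionary keys reuse the variable names from this article, while the functions `matches_any` and `without` are hypothetical helpers.

```python
# Variable names taken from the BLANG table, without the "$text." prefix.
CATEGORY_VARIABLES = (
    "blasphemyCount",
    "sexualTermCount",
    "badWordCount",
    "violenceTermCount",
    "extremismTermCount",
    "racismTermCount",
)


def matches_any(counts: dict, variables: tuple = CATEGORY_VARIABLES) -> bool:
    """Model `$text.a>0 OR $text.b>0 OR ...` over the chosen variables."""
    return any(counts.get(name, 0) > 0 for name in variables)


def without(*excluded: str) -> tuple:
    """Return the category variables minus the ones you choose to tolerate."""
    return tuple(name for name in CATEGORY_VARIABLES if name not in excluded)


counts = {"blasphemyCount": 1}
print(matches_any(counts))                             # True
print(matches_any(counts, without("blasphemyCount")))  # False
```

With all six variables the check behaves like $text.toxicTermCount>0. Once blasphemyCount is removed, a text containing only blasphemy terms no longer matches.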

Setting the rule's action

Mild profanity or other kinds of toxic language can sometimes be acceptable depending on the context in which they are used, and how much you tolerate on your site.
For instance, use of common words like 'crap' may not be reason enough to refuse a piece of content.

For this reason, it is preferable to set the corresponding rule's action to Send to manual rather than Refuse, so that the content can be reviewed by a moderator.

Finally, you may decide to refuse content that contains multiple occurrences of toxic language. You can do so by adding a rule such as:

$text.toxicTermCount>=3

and setting its action to Refuse.
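
Taken together, the two rules amount to a simple two-tier policy, sketched below in Python. This is a model of the intended outcome, not of how Implio evaluates rules. It assumes that the stricter rule wins when both conditions match, which this article does not state, and the function name and the "No action" label are hypothetical.

```python
def decide_action(toxic_term_count: int, refuse_threshold: int = 3) -> str:
    """Map a toxic term count to the action suggested in this article."""
    if toxic_term_count < 0:
        raise ValueError("toxic_term_count cannot be negative")
    if toxic_term_count >= refuse_threshold:
        # Corresponds to the rule $text.toxicTermCount>=3.
        return "Refuse"
    if toxic_term_count > 0:
        # Corresponds to the rule $text.toxicTermCount>0.
        return "Send to manual"
    # Hypothetical label for content that matches neither rule.
    return "No action"


for count in (0, 1, 3):
    print(count, decide_action(count))
```

Raising or lowering `refuse_threshold` corresponds to changing the number in the second rule to match how much toxic language you tolerate.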

Known limitations

Our Text Vision filters have been meticulously crafted by our team of linguists and data scientists and tested against large corpora of user-generated content.

However, they may sometimes produce false positives. Conversely, they may miss some terms or expressions.

We update Text Vision filters regularly. We value and welcome your feedback to help us improve Implio.
