Multi-lingual support for task bots
The accuracy of a bot trained with Swiftmatch and MindMeld depends on several features that are involved both in training the bot and in running inference on incoming user messages. The key features that determine whether a language is supported, and how accurately, are:
- Text (support for the language and the script it is written in)
- Contextual spellcheck (whether spelling mistakes in the training data or the user query can be corrected)
- Wordforms (handling the various forms and synonyms of words present in the training data)
For task bot use cases that rely on system entities, the desired language must also support those system entities.
Note
For entity recognition in task bots:
- Custom list, regex, and free-form entity types are supported in all the languages listed below.
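The feature checks described above can be modelled as a simple lookup. The following is a minimal sketch, not a platform API: `LanguageFeatures`, `PREPROCESSING`, and `missing_features` are hypothetical names introduced for illustration, the three sample entries are copied from the text preprocessing table on this page, and a blank cell in that table is assumed to mean "not listed as supported".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageFeatures:
    """Preprocessing features listed for one language (hypothetical structure)."""
    text: bool
    spellcheck: bool
    wordforms: bool


# Excerpt copied from the "Languages for text preprocessing" table.
# Assumption: a blank cell means the feature is not listed as supported.
PREPROCESSING = {
    "English": LanguageFeatures(text=True, spellcheck=True, wordforms=True),
    "Spanish": LanguageFeatures(text=True, spellcheck=True, wordforms=False),
    "Ukrainian": LanguageFeatures(text=True, spellcheck=False, wordforms=False),
}


def missing_features(language, required):
    """Return the required feature names that the language does not list."""
    features = PREPROCESSING.get(language)
    if features is None:
        # Language is not in the excerpt at all: everything is missing.
        return sorted(required)
    return sorted(name for name in required if not getattr(features, name))


print(missing_features("Spanish", {"text", "spellcheck", "wordforms"}))
```

An empty result means every required feature is listed for that language; a non-empty result names the gaps to weigh before choosing the language for a bot.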
Languages supported by task bots
The following languages are supported by LASER (language encoder):
- Arabic
- Bulgarian
- Catalan
- Chinese
- Croatian
- Danish
- Dutch
- English
- Finnish
- French
- Georgian
- German
- Greek
- Hebrew
- Hindi
- Hungarian
- Indonesian
- Irish
- Italian
- Japanese
- Korean
- Lithuanian
- Macedonian
- Norwegian Bokmål
- Polish
- Portuguese
- Romanian
- Russian
- Spanish
- Swedish
- Turkish
- Ukrainian
- Vietnamese
The following languages are supported by Polymatch:
- Arabic
- Bulgarian
- Catalan
- Croatian
- Danish
- Dutch
- English
- Finnish
- French
- Georgian
- German
- Greek
- Hebrew
- Hindi
- Hungarian
- Indonesian
- Irish
- Italian
- Korean
- Lithuanian
- Macedonian
- Mongolian (mn)
- Norwegian Bokmål
- Polish
- Portuguese
- Romanian
- Russian
- Spanish
- Swedish
- Turkish
- Ukrainian
- Vietnamese
Languages vs entities supported
Language | Ordinal | Quantity | Cardinal | Money | Duration | Date/time | Person | Location |
---|---|---|---|---|---|---|---|---|
Arabic | Supported | Supported | Supported | Supported | Supported | Supported | ||
Bulgarian | Supported | Supported | Supported | Supported | Supported | Supported | ||
Catalan | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Chinese | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Croatian | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Danish | Supported | Supported | Supported | Supported | Supported | Supported | ||
Dutch | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
English | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Finnish | Supported | Supported | Supported | Supported | Supported | Supported | ||
French | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Georgian | Supported | Supported | Supported | Supported | Supported | |||
German | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
Greek | Supported | Supported | Supported | Supported | Supported | Supported | ||
Hebrew | Supported | Supported | Supported | Supported | Supported | |||
Hindi | Supported | Supported | Supported | Supported | Supported | |||
Hungarian | Supported | Supported | Supported | Supported | ||||
Indonesian | Supported | Supported | Supported | |||||
Irish | Supported | Supported | Supported | Supported | Supported | Supported | ||
Italian | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Japanese | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Korean | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Lithuanian | Supported | Supported | Supported | |||||
Macedonian | Supported | Supported | Supported | Supported | Supported | Supported | ||
Mongolian | Supported | Supported | Supported | Supported | Supported | Supported | ||
Norwegian Bokmål | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
Polish | Supported | Supported | Supported | Supported | Supported | Supported | ||
Portuguese | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Romanian | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Russian | Supported | Supported | Supported | Supported | Supported | Supported | ||
Spanish | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Swedish | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported |
Turkish | Supported | Supported | Supported | Supported | Supported | Supported | ||
Ukrainian | Supported | Supported | Supported | Supported | Supported | Supported | ||
Vietnamese | Supported | Supported | Supported | Supported | ||||
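To read the table above programmatically, each row can be split on the pipe character and matched against the entity columns. This is a minimal sketch under two assumptions: the cells after the language name follow the order Ordinal through Location, and a blank cell means the entity is not listed as supported. `ENTITY_COLUMNS` and `supported_entities` are hypothetical names used only for illustration.

```python
from typing import Set, Tuple

# Assumed column order of the cells that follow the language name.
ENTITY_COLUMNS = [
    "Ordinal", "Quantity", "Cardinal", "Money",
    "Duration", "Date/time", "Person", "Location",
]


def supported_entities(row: str) -> Tuple[str, Set[str]]:
    """Parse one pipe-delimited table row into (language, supported entities)."""
    cells = [cell.strip() for cell in row.split("|")]
    language = cells[0]
    marks = cells[1:1 + len(ENTITY_COLUMNS)]
    # zip stops at the shorter sequence, so truncated rows are handled too.
    entities = {
        column for column, mark in zip(ENTITY_COLUMNS, marks)
        if mark == "Supported"
    }
    return language, entities


language, entities = supported_entities(
    "Hungarian | Supported | Supported | Supported | Supported | ||||"
)
print(language, sorted(entities))
```

Keeping the parser tolerant of short rows means a row whose trailing empty cells were dropped still yields the same set of entities.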
Languages for text preprocessing
Language | Text | Spellcheck | Wordforms |
---|---|---|---|
English | ✓ | ✓ | ✓ |
Spanish | ✓ | ✓ | |
Portuguese | ✓ | ✓ | |
Russian | ✓ | ✓ | |
Turkish | ✓ | ✓ | |
French | ✓ | ✓ | |
German | ✓ | ✓ | |
Italian | ✓ | ✓ | |
Arabic | ✓ | ✓ | |
Polish | ✓ | ✓ | |
Dutch | ✓ | ✓ | |
Danish | ✓ | ✓ | |
Korean | ✓ | ✓ | |
Norwegian | ✓ | ✓ | |
Swedish | ✓ | ✓ | |
Finnish | ✓ | ✓ | |
Ukrainian | ✓ | ||
Hebrew | ✓ | ||
Greek | ✓ | ||
Bulgarian | ✓ | ||
Catalan | ✓ | ||
Croatian | ✓ | ||
Georgian | ✓ | ||
Hungarian | ✓ | ||
Irish | ✓ | ||
Hindi | ✓ | ||
Bengali | ✓ | ||
Punjabi | ✓ | ||
Marathi | ✓ | ||
Telugu | ✓ | ||
Vietnamese | ✓ | ||
Tamil | ✓ | ||
Urdu | ✓ | ||
Javanese | ✓ | ||
Gujarati | ✓ | ||
Persian | ✓ | ||
Bhojpuri | ✓ | ||
Hausa | ✓ | ||
Kannada | ✓ | ||
Indonesian | ✓ | ||
Yoruba | ✓ | ||
Malayalam | ✓ | ||
Odia | ✓ | ||
Maithili | ✓ | ||
Burmese | ✓ | ||
Uzbek | ✓ | ||
Sindhi | ✓ | ||
Romanian | ✓ | ||
Pashto | ✓ | ||
Magahi | ✓ | ||
Malay | ✓ | ||
Nepali | ✓ | ||
Assamese | ✓ | ||
Afrikaans | ✓ | ||
Albanian | ✓ | ||
Amharic | ✓ | ||
Armenian | ✓ | ||
Azerbaijani | ✓ | ||
Basque | ✓ | ||
Belarusian | ✓ | ||
Bosnian | ✓ | ||
Czech | ✓ | ||
Esperanto | ✓ | ||
Estonian | ✓ | ||
Galician | ✓ | ||
Icelandic | ✓ | ||
Kazakh | ✓ | ||
Kurdish | ✓ | ||
Latvian | ✓ | ||
Lithuanian | ✓ | ||
Macedonian | ✓ | ||
Malagasy | ✓ | ||
Serbian | ✓ | ||
Sinhala | ✓ | ||
Slovak | ✓ | ||
Slovenian | ✓ | ||
Somali | ✓ | ||
Swahili | ✓ | ||
Tagalog | ✓ | ||
Tajik | ✓ | ||
Tatar | ✓ | ||
Features supported by other languages include:
Language | Text | Spellcheck | Common system entities | Wordforms |
---|---|---|---|---|
Chinese | ✓ | ✓ | ✓ | |
Japanese | ✓ | ✓ | ✓ | |
Thai | ✓ | |||
Burmese | ✓ | |||
Khmer | ✓ |