SSML and language detection

When voice_lang is null, the system automatically detects the voice language from message content using the patrickschur/language-detection library.

Supported languages: English (EN), German (DE), French (FR), Italian (IT), Spanish (ES), Polish (PL), Dutch (NL), Romanian (RO), Portuguese (PT), Czech (CS), Hungarian (HU), Swedish (SV), Danish (DA), Finnish (FI), Slovak (SK), Croatian (HR), Turkish (TR), Russian (RU), Bulgarian (BG), Ukrainian (UK).
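A client can check voice_lang before sending a message and pass null when it wants automatic detection. The following is a minimal sketch, not part of the API: the helper name normalize_voice_lang is hypothetical, and the use of upper-case codes simply mirrors the list above (the exact casing the API expects is an assumption to verify).

```python
# Language codes copied from the supported-languages list above.
SUPPORTED_VOICE_LANGS = frozenset({
    "EN", "DE", "FR", "IT", "ES", "PL", "NL", "RO", "PT", "CS",
    "HU", "SV", "DA", "FI", "SK", "HR", "TR", "RU", "BG", "UK",
})


def normalize_voice_lang(code):
    """Return a supported code, or None to request automatic detection.

    Hypothetical client-side helper; the API itself does not provide it.
    """
    if code is None or not code.strip():
        # A null voice_lang lets the system detect the language itself.
        return None
    normalized = code.strip().upper()
    if normalized not in SUPPORTED_VOICE_LANGS:
        raise ValueError(f"unsupported voice_lang: {code!r}")
    return normalized
```

Rejecting unknown codes on the client side surfaces typos early instead of relying on how the server handles an unsupported value.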

The API now supports a subset of SSML (Speech Synthesis Markup Language) for advanced voice message formatting.

  • <break> - Insert pauses in speech
    • Purpose: Add pauses between words or sentences
    • Attributes: time (e.g., “1s”, “500ms”)
    • Pricing: A pause is billed as extra characters by its length: <0.5s = 1 char, <1s = 2 chars, <1.5s = 3 chars, etc.
  • <say-as> - Control pronunciation of text
    • Purpose: Control how text is interpreted and spoken
    • Supported attributes:
      • interpret-as="characters" - Spell out text character by character
    • Note: Currently only the “characters” value is supported
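The pause pricing above can be estimated before sending a message. This sketch assumes the tiers continue in 0.5-second steps, so each started half second costs one character, and that a pause of exactly 0.5s already falls into the next tier (the list uses "<"). The function names are hypothetical, and actual billing may differ.

```python
import re
from decimal import Decimal

# Accepts values such as "1s", "1.5s" or "500ms".
_TIME_RE = re.compile(r"^(\d+(?:\.\d+)?)(ms|s)$")


def break_duration_ms(time_value):
    """Convert a <break> time attribute to whole milliseconds."""
    match = _TIME_RE.match(time_value.strip())
    if match is None:
        raise ValueError(f"unsupported time value: {time_value!r}")
    number, unit = match.groups()
    factor = 1000 if unit == "s" else 1
    # Decimal avoids floating-point rounding surprises for values like "1.5s".
    return int(Decimal(number) * factor)


def break_cost_chars(time_value):
    """Estimated character cost: 1 char per started 500 ms (assumed rule)."""
    return break_duration_ms(time_value) // 500 + 1
```

Under these assumptions, break_cost_chars("400ms") gives 1 and break_cost_chars("1s") gives 3, because a pause of exactly one second is no longer under 1s.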

Wrap your message content in the <speak> root element:

<speak>
Your verification code is <say-as interpret-as="characters">A1B2C3</say-as>.
<break time="1s"/>
Please enter it now.
</speak>
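A message like the one above can also be assembled in code. The sketch below uses only the two supported tags and escapes user-supplied text, since SSML is XML; whether the API requires this escaping is an assumption. The helper names spell_out, pause and speak are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def spell_out(text):
    """Wrap text in <say-as> so it is read character by character."""
    return f'<say-as interpret-as="characters">{escape(text)}</say-as>'


def pause(time_value):
    """Build a <break> element, e.g. pause("1s")."""
    return f"<break time={quoteattr(time_value)}/>"


def speak(*parts):
    """Join already-built fragments inside the <speak> root element."""
    return "<speak>" + "".join(parts) + "</speak>"


message = speak(
    escape("Your verification code is "),
    spell_out("A1B2C3"),
    escape("."),
    pause("1s"),
    escape(" Please enter it now."),
)
```

Plain text is passed through escape so that characters such as & or < in a message cannot break the markup.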