API Key for API endpoints
The text to be converted to speech.
"I am solving the equation x = (-b +/- sqrt(b^2-4ac)) / 2a. Nobody can, it is a disaster and very sad."
Optional prompt to guide the style of generated speech. Ignored when speaker embedding is provided.
"Very happy."
Preset voice for synthesis. Ignored when speaker embedding is provided.
Vivian, Serena, Uncle_Fu, Dylan, Eric, Ryan, Aiden, Ono_Anna, Sohee
"Vivian"
The language of the voice.
Auto, English, Chinese, Spanish, French, German, Italian, Japanese, Korean, Portuguese, Russian
"English"
URL to a speaker embedding safetensors file from fal-ai/qwen-3-tts/clone-voice. If provided, cloned voice is used instead of predefined voices.
"https://storage.googleapis.com/falserverless/example_outputs/qwen3-tts/clone_out.safetensors"
Optional reference text used when creating the speaker embedding. This can improve quality when using cloned voice.
"Okay. Yeah. I resent you. I love you. I respect you. But you know what? You blew it!"
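The precedence described above (a speaker embedding overrides both the preset voice and the style prompt) can be sketched as a small client-side helper. This is only an illustration: the key names (`text`, `language`, `voice`, `prompt`, `speaker_embedding_url`, `reference_text`) and the function name are hypothetical, since this reference does not show the real field names, and the defaults simply mirror the example values above rather than any documented service default.

```python
from typing import Optional

# Options copied from the lists above.
VOICES = {"Vivian", "Serena", "Uncle_Fu", "Dylan", "Eric", "Ryan",
          "Aiden", "Ono_Anna", "Sohee"}
LANGUAGES = {"Auto", "English", "Chinese", "Spanish", "French", "German",
             "Italian", "Japanese", "Korean", "Portuguese", "Russian"}


def build_voice_arguments(
    text: str,
    voice: str = "Vivian",
    language: str = "English",
    prompt: Optional[str] = None,
    speaker_embedding_url: Optional[str] = None,
    reference_text: Optional[str] = None,
) -> dict:
    """Build a request dict. All key names here are hypothetical."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    args = {"text": text, "language": language}

    if speaker_embedding_url is not None:
        # Cloned voice: the preset voice and style prompt are documented
        # as ignored, so this sketch leaves them out of the request.
        args["speaker_embedding_url"] = speaker_embedding_url
        if reference_text is not None:
            args["reference_text"] = reference_text
        return args

    if voice not in VOICES:
        raise ValueError(f"unsupported voice: {voice!r}")
    args["voice"] = voice
    if prompt is not None:
        args["prompt"] = prompt
    return args
```

Dropping the ignored fields on the client is a design choice of this sketch, not a requirement stated here; sending them alongside an embedding should be harmless if the service ignores them as described.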
Top-k sampling parameter.
x >= 0
Top-p sampling parameter.
0 <= x <= 1
Sampling temperature; higher means more random output.
0 <= x <= 1
Penalty to reduce repeated tokens/codes.
x >= 0
Sampling switch for the sub-talker.
Top-k sampling for the sub-talker.
x >= 0
Top-p for sub-talker sampling.
0 <= x <= 1
Temperature for sub-talker sampling.
0 <= x <= 1
Maximum number of new codec tokens to generate.
1 <= x <= 8192
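The numeric ranges above lend themselves to a client-side check before a request is sent. The following is a minimal sketch, assuming each range applies to the description that precedes it. The parameter names (`top_k`, `top_p`, `temperature`, `repetition_penalty`, `subtalker_top_k`, `subtalker_top_p`, `subtalker_temperature`, `max_new_tokens`) are hypothetical labels chosen for illustration, because this reference does not list the actual field names.

```python
# (low, high) bounds taken from the ranges above; None means no upper bound.
# The key names are hypothetical labels, not confirmed field names.
BOUNDS = {
    "top_k": (0, None),
    "top_p": (0, 1),
    "temperature": (0, 1),
    "repetition_penalty": (0, None),
    "subtalker_top_k": (0, None),
    "subtalker_top_p": (0, 1),
    "subtalker_temperature": (0, 1),
    "max_new_tokens": (1, 8192),
}


def check_sampling(params: dict) -> list:
    """Return one message per out-of-range value; an empty list means OK."""
    problems = []
    for name, value in params.items():
        if name not in BOUNDS:
            continue  # unknown keys are left for the service to judge
        low, high = BOUNDS[name]
        if value < low or (high is not None and value > high):
            upper = "" if high is None else f" <= {high}"
            problems.append(f"{name}={value} violates {low} <= x{upper}")
    return problems
```

For example, `check_sampling({"top_p": 1.5})` reports one problem, while values inside the listed ranges produce an empty list. The sub-talker sampling switch is omitted because no numeric range is listed for it.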