How does the Text to Speech tool generate spoken audio from my text?
The tool uses the SpeechSynthesis interface of the Web Speech API, which is built into modern browsers. The browser passes your text to a speech engine, usually the operating system's native TTS engine, which produces the audio using installed voices. With a local voice, synthesis happens on your device and no text is uploaded to a server. You can choose a voice, adjust the speaking rate, and tune the pitch to match the tone you want.
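As a rough illustration of the flow described above, the following sketch shows how a page might hand text to SpeechSynthesis. It is not the tool's actual source: the SpeakOptions shape, the speakText function, and the voiceName field are hypothetical names chosen for this example. Only speechSynthesis, SpeechSynthesisUtterance, getVoices, and speak come from the Web Speech API.

```typescript
// Hypothetical settings object; the field names are assumptions for this sketch.
interface SpeakOptions {
  text: string;
  rate?: number;      // 1.0 is normal speed
  pitch?: number;     // 1.0 is the default pitch
  voiceName?: string; // name of the voice picked in the dropdown, if any
}

// Speak text through the browser's SpeechSynthesis interface.
// Returns false when the API is unavailable (for example, outside a browser).
function speakText(options: SpeakOptions): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) {
    return false;
  }

  const utterance = new g.SpeechSynthesisUtterance(options.text);
  utterance.rate = options.rate ?? 1.0;
  utterance.pitch = options.pitch ?? 1.0;

  // Use the requested voice if the browser lists one with that name;
  // otherwise the browser falls back to its default voice.
  const voices: any[] = g.speechSynthesis.getVoices();
  const match = voices.find((v) => v.name === options.voiceName);
  if (match) {
    utterance.voice = match;
  }

  g.speechSynthesis.speak(utterance);
  return true;
}
```

In a browser, a call such as speakText({ text: "Hello", rate: 1.2 }) would queue the utterance for playback, assuming the API is available.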
Why are different voices available on different devices and browsers?
Most voices are provided by your operating system rather than the browser. Windows, macOS, iOS, Android, and ChromeOS each ship with their own set of system voices, and some browsers, such as Chrome, add cloud-based voices on top. As a result, the dropdown of available voices looks different on a Mac than on a Windows PC or an Android phone. You can often install additional voices or language packs through your OS settings to expand the list.
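To see which voices a given device exposes, a page can query getVoices and group the results by language. The sketch below is one possible way to do that; VoiceInfo, groupVoicesByLanguage, and loadVoices are hypothetical names, and the list may be empty until the browser fires its voiceschanged event, so the loader waits for it when needed.

```typescript
// Subset of the fields on a SpeechSynthesisVoice that this sketch uses.
interface VoiceInfo {
  name: string;
  lang: string; // language tag such as "en-US"
  localService: boolean;
}

// Group voice names by their language tag.
function groupVoicesByLanguage(voices: VoiceInfo[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const voice of voices) {
    const names = groups.get(voice.lang) ?? [];
    names.push(voice.name);
    groups.set(voice.lang, names);
  }
  return groups;
}

// Resolve with the voice list, waiting for "voiceschanged" if the list
// is still empty. Resolves with [] when the API is unavailable.
function loadVoices(): Promise<VoiceInfo[]> {
  const synth = (globalThis as any).speechSynthesis;
  if (!synth) {
    return Promise.resolve([]);
  }
  const ready: VoiceInfo[] = synth.getVoices();
  if (ready.length > 0 || typeof synth.addEventListener !== "function") {
    return Promise.resolve(ready);
  }
  return new Promise((resolve) => {
    synth.addEventListener(
      "voiceschanged",
      () => resolve(synth.getVoices()),
      { once: true }
    );
  });
}
```

Running loadVoices().then(groupVoicesByLanguage) on two different devices should make the differences in the dropdown visible. A browser that never fires voiceschanged would leave the promise pending, so a production version might add a timeout.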
What do the rate and pitch sliders actually control?
The rate slider controls how fast the voice speaks: 1.0 is normal speed, lower values slow it down, and higher values speed it up. The pitch slider raises or lowers the perceived tone of the voice, with 1.0 as the default. Adjusting pitch makes a voice sound deeper or higher without changing its speed. The Web Speech API accepts rates from 0.1 to 10 and pitches from 0 to 2, although individual voices and engines may honor a narrower range. Combine both sliders to create variations suitable for narration, accessibility readouts, or playful content.
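Slider values are usually validated before they reach the utterance. The sketch below clamps a value to the ranges the Web Speech API specification defines for rate (0.1 to 10) and pitch (0 to 2). The names RATE_RANGE, PITCH_RANGE, and clampSetting are hypothetical, and the tool's own sliders may use tighter limits.

```typescript
interface SettingRange {
  min: number;
  max: number;
  fallback: number; // used when the input is not a finite number
}

// Ranges taken from the Web Speech API specification.
const RATE_RANGE: SettingRange = { min: 0.1, max: 10, fallback: 1 };
const PITCH_RANGE: SettingRange = { min: 0, max: 2, fallback: 1 };

// Keep a slider value inside the allowed range, or return the default
// when the value is NaN or infinite.
function clampSetting(value: number, range: SettingRange): number {
  if (!Number.isFinite(value)) {
    return range.fallback;
  }
  return Math.min(range.max, Math.max(range.min, value));
}
```

For example, clampSetting(20, RATE_RANGE) returns 10, and clampSetting(-1, PITCH_RANGE) returns 0.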
Which browsers support the Web Speech API for text to speech?
SpeechSynthesis is supported in Chrome, Edge, Safari, Firefox, and Opera on desktop, and in Safari on iOS, Chrome on Android, and Samsung Internet. Coverage is broad, but voice quality and selection vary widely. Chrome on desktop typically offers the richest voice list because it includes Google's online voices. Older browsers and some embedded webviews do not support the API, in which case the tool will display a fallback message.
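Support is normally checked with feature detection rather than by sniffing the browser name. The following sketch shows one way a page could decide whether to show a fallback message. SpeechScope, supportsSpeechSynthesis, and statusMessage are hypothetical names, and the message wording is an assumption rather than the tool's actual text.

```typescript
// The two globals the sketch looks for; typed loosely so the check also
// runs outside a browser.
interface SpeechScope {
  speechSynthesis?: unknown;
  SpeechSynthesisUtterance?: unknown;
}

// True when both the synthesis controller and the utterance constructor exist.
function supportsSpeechSynthesis(
  scope: SpeechScope = globalThis as unknown as SpeechScope
): boolean {
  return (
    typeof scope.speechSynthesis === "object" &&
    scope.speechSynthesis !== null &&
    typeof scope.SpeechSynthesisUtterance === "function"
  );
}

// Pick the message to show the user (wording is illustrative only).
function statusMessage(scope?: SpeechScope): string {
  return supportsSpeechSynthesis(scope)
    ? "Text to speech is ready."
    : "Your browser does not support text to speech.";
}
```

Passing the scope as a parameter keeps the check easy to exercise with a stand-in object; in a page, calling statusMessage() with no argument inspects the real globals.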
Can I download the generated speech as an audio file?
The Web Speech API has no built-in way to capture synthesized output as a downloadable file. Audio goes straight to the audio output device, and the page never receives it as a stream or buffer that it could save. If you need an MP3 or WAV of the speech, use a screen recorder, a virtual audio cable, or a dedicated TTS service that returns audio files. The browser tool is best for live playback.
Is my text sent to a server when I click Speak?
In most cases, no. The browser hands the text to the local OS speech engine, so it stays on your device. However, Chrome on desktop offers some Google network voices that may transmit text to Google's servers for processing. If privacy is critical, choose a voice labeled as a system or local voice rather than a network voice, and check your browser's privacy notice for details about which engines run online.
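Each voice returned by getVoices carries a localService flag that the browser sets to indicate whether the voice is provided by a local engine. The sketch below filters a voice list on that flag. VoiceLike and localVoicesOnly are hypothetical names, and the flag reflects only what the browser reports, so it is a hint rather than a privacy guarantee.

```typescript
// Subset of SpeechSynthesisVoice fields used by this sketch.
interface VoiceLike {
  name: string;
  lang: string;
  localService: boolean; // true when the browser reports a local engine
}

// Keep only the voices the browser reports as local.
function localVoicesOnly<T extends VoiceLike>(voices: T[]): T[] {
  return voices.filter((voice) => voice.localService);
}
```

In a browser, localVoicesOnly(speechSynthesis.getVoices()) would give the list to offer when privacy matters most.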