Audio Settings

This guide explains how to configure the audio features: Speech-to-Text (STT) and Text-to-Speech (TTS).

The Speech-to-Text feature converts your voice into text, supporting voice-based recording and input.

STT Model: Select the AI model used for speech recognition. A voice-type model must already be added in the model configuration.
Language: Select the language to recognize. Chinese, English, and other languages are supported.

The Text-to-Speech feature converts text into spoken audio, with adjustable speech speed.

TTS Model: Select the AI model used for speech synthesis. A voice-type model must already be added in the model configuration.
Speech Rate: Adjust the speech speed, from 0.5x (slow) to 2.0x (fast):
- 0.5x - Suitable for learning and comprehension
- 1.0x - Normal speed (default)
- 1.5x - Fast browsing
- 2.0x - Speed reading
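
The rate range above can be illustrated with a minimal sketch. It assumes a script or integration that needs to keep a requested rate inside the documented 0.5x to 2.0x range and fall back to the 1.0x default. The function and constant names are hypothetical and are not part of the application.

```python
# Hypothetical helper: these names are illustrative, not the application's API.
MIN_RATE = 0.5      # slowest rate listed in this guide
MAX_RATE = 2.0      # fastest rate listed in this guide
DEFAULT_RATE = 1.0  # normal speed (default)


def normalize_speech_rate(value=None):
    """Return a speech rate inside the documented 0.5x-2.0x range."""
    if value is None:
        # Nothing chosen yet: use the default rate.
        return DEFAULT_RATE
    rate = float(value)
    # Clamp out-of-range values to the nearest limit instead of rejecting them.
    return max(MIN_RATE, min(MAX_RATE, rate))


print(normalize_speech_rate())      # 1.0
print(normalize_speech_rate(3))     # 2.0
print(normalize_speech_rate(0.25))  # 0.5
```

Clamping is one possible design choice here. An interface could equally reject out-of-range values and ask the user to pick again.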

Model Configuration: Before using the audio features, make sure you have added voice-type models in the model configuration.
Network Requirements: Speech recognition and synthesis require a network connection. Make sure your connection is stable.
Quality Optimization: Recognition accuracy and speech naturalness depend on the model you select, so choose an appropriate one.
Speed Recommendation: For first-time use, start with normal speed (1.0x) and adjust as needed.
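
As a rough illustration, the prerequisites above could be checked before the audio features are used. This is a sketch under stated assumptions: `AudioSettings`, its fields, and `missing_prerequisites` are hypothetical names that mirror the options in this guide, and the network state is passed in as a plain flag rather than detected.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AudioSettings:
    # Hypothetical fields mirroring the options described in this guide.
    stt_model: Optional[str] = None
    tts_model: Optional[str] = None
    speech_rate: float = 1.0


def missing_prerequisites(settings: AudioSettings, online: bool) -> List[str]:
    """List the problems that would stop the audio features from working."""
    problems = []
    if not settings.stt_model:
        problems.append("No STT model selected; add a voice-type model first.")
    if not settings.tts_model:
        problems.append("No TTS model selected; add a voice-type model first.")
    if not online:
        problems.append("No network connection; recognition and synthesis need one.")
    return problems


# Example: a TTS model is set, but no STT model has been chosen yet.
for problem in missing_prerequisites(AudioSettings(tts_model="example-tts"), online=True):
    print(problem)
```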

On the writing page, select some text and then use the text-to-speech feature to hear it read aloud.
On the recording page, click the voice button to start voice input directly.