Audio Settings
This guide covers audio configuration, including the Speech-to-Text (STT) and Text-to-Speech (TTS) features.
Speech-to-Text (STT)
The Speech-to-Text feature converts your voice into text, supporting voice-based recording and input.
Configuration
- STT Model: Select the AI model used for speech recognition. A voice-type model must already be set up in the model configuration.
- Language: Select the recognition language. Chinese, English, and other languages are supported.
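The two settings above can be pictured as a small record that is checked before recognition starts. The sketch below is illustrative only: the `SttSettings` class, the `CONFIGURED_MODELS` registry, the model names, and the `"voice"` type label are all hypothetical and do not reflect how the application stores its configuration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical registry of models already added in the model configuration.
# The names and the "voice" type label are assumptions for illustration.
CONFIGURED_MODELS = {
    "example-stt-model": "voice",
    "example-chat-model": "chat",
}


@dataclass(frozen=True)
class SttSettings:
    model: str     # name of a pre-configured voice-type model
    language: str  # recognition language, e.g. "zh" or "en"


def validate_stt(settings: SttSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings are usable."""
    problems = []
    model_type = CONFIGURED_MODELS.get(settings.model)
    if model_type is None:
        problems.append(f"model {settings.model!r} is not configured")
    elif model_type != "voice":
        problems.append(f"model {settings.model!r} is not a voice-type model")
    if not settings.language:
        problems.append("no recognition language selected")
    return problems
```

Calling `validate_stt(SttSettings("example-stt-model", "en"))` returns an empty list, while a model that is missing or not voice-type produces a message explaining what to fix in the model configuration.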
Use Cases
- Quickly record ideas via voice on the recording page
- Use voice in place of keyboard input to record faster
- Suitable for mobile use
Text-to-Speech (TTS)
The Text-to-Speech feature converts text into spoken audio, with adjustable speech speed.
Configuration
- TTS Model: Select the AI model used for speech synthesis. A voice-type model must already be set up in the model configuration.
- Speech Rate: Adjust the playback speed from 0.5x (slow) to 2.0x (fast)
  - 0.5x - Suitable for learning and comprehension
  - 1.0x - Normal speed (default)
  - 1.5x - Fast browsing
  - 2.0x - Speed reading
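To make the effect of the speech rate concrete, here is a minimal sketch that keeps a requested rate inside the 0.5x to 2.0x range and estimates listening time. The function names are hypothetical, and the estimate assumes playback time scales inversely with the rate, which may not hold exactly for every synthesis model.

```python
MIN_RATE = 0.5      # slowest rate in the range above
MAX_RATE = 2.0      # fastest rate in the range above
DEFAULT_RATE = 1.0  # normal speed


def clamp_rate(rate: float) -> float:
    """Keep a requested rate inside the supported 0.5x to 2.0x range."""
    return max(MIN_RATE, min(MAX_RATE, rate))


def playback_seconds(normal_seconds: float, rate: float = DEFAULT_RATE) -> float:
    """Estimate listening time, assuming duration scales inversely with rate."""
    return normal_seconds / clamp_rate(rate)
```

Under that assumption, a document that takes 10 minutes (600 seconds) at normal speed takes about 5 minutes at 2.0x and about 20 minutes at 0.5x.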
Use Cases
- Read note content aloud for review
- Audio playback for long documents
- Rest your eyes and take in information by listening
- Suitable for commuting, exercise, and other scenarios
Notes
- Model Configuration: Before using audio features, ensure you have added voice-type models in the model configuration
- Network Requirements: Speech recognition and synthesis require a network connection. Make sure your connection is stable
- Quality Optimization: Selecting an appropriate model can improve recognition accuracy and the naturalness of synthesized speech
- Speed Recommendation: For first-time use, start with normal speed and adjust as needed
Quick Actions
- On the writing page, select text to have it read aloud with the text-to-speech feature
- On the recording page, click the voice button to start voice input directly