Overview
Dubbing generates a new audio track for your video in the target language, using a voice that sounds like the original speaker. Unlike subtitles, dubbing lets viewers hear the content in their own language with the creator’s voice characteristics.
Supported dubbing languages
Voice cloning is supported for 10 languages:

| Flag | Language | Code |
|---|---|---|
| 🇺🇸 | English | eng |
| 🇨🇳 | Chinese | zho |
| 🇰🇷 | Korean | kor |
| 🇮🇹 | Italian | ita |
| 🇪🇸 | Spanish | spa |
| 🇧🇷 | Portuguese | por |
| 🇩🇪 | German | deu |
| 🇫🇷 | French | fra |
| 🇯🇵 | Japanese | jpn |
| 🇷🇺 | Russian | rus |
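
The codes in the table are three-letter codes that match ISO 639-3 identifiers. As a minimal sketch, the table can be kept as a lookup to check a list of target languages before starting a job. The helper name `split_by_dubbing_support` is hypothetical and is not part of any Neolli API.

```python
# Dubbing language codes copied from the table above.
DUBBING_LANGUAGES = {
    "eng": "English",
    "zho": "Chinese",
    "kor": "Korean",
    "ita": "Italian",
    "spa": "Spanish",
    "por": "Portuguese",
    "deu": "German",
    "fra": "French",
    "jpn": "Japanese",
    "rus": "Russian",
}


def split_by_dubbing_support(codes):
    """Hypothetical helper: separate requested codes into supported and unsupported."""
    supported, unsupported = [], []
    for code in codes:
        normalized = code.strip().lower()
        if normalized in DUBBING_LANGUAGES:
            supported.append(normalized)
        else:
            unsupported.append(normalized)
    return supported, unsupported


if __name__ == "__main__":
    ok, rejected = split_by_dubbing_support(["kor", "JPN", "ara"])
    print("dubbing available:", [DUBBING_LANGUAGES[c] for c in ok])
    print("not in the dubbing list:", rejected)
```
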
How voice cloning works
The voice cloning process works in three steps:
- Profile — Creates a voice profile from the original speaker’s audio
- Synthesize — Generates speech in the target language using the cloned voice
- Calibrate — Adjusts audio duration to match the original segment timing
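
This page does not describe how the Calibrate step is implemented. The sketch below only illustrates the general idea of fitting dubbed speech into the original time slot: it computes a playback-speed multiplier from the two durations. The `Segment` type, the `tempo_factor` function, and the clamp limits of 0.8 and 1.25 are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float                 # original segment start, in seconds
    end: float                   # original segment end, in seconds
    synthesized_duration: float  # length of the dubbed audio, in seconds


def tempo_factor(segment, min_factor=0.8, max_factor=1.25):
    """Return the speed multiplier that fits the dubbed audio into the slot.

    A factor above 1.0 speeds the speech up; below 1.0 slows it down.
    The clamp limits are illustrative assumptions, not documented values.
    """
    target = segment.end - segment.start
    if target <= 0 or segment.synthesized_duration <= 0:
        raise ValueError("durations must be positive")
    raw = segment.synthesized_duration / target
    return max(min_factor, min(max_factor, raw))


if __name__ == "__main__":
    seg = Segment(start=12.0, end=16.0, synthesized_duration=4.4)
    factor = tempo_factor(seg)
    print(f"speed multiplier: {factor:.2f}")
    print(f"calibrated duration: {seg.synthesized_duration / factor:.2f} s")
```

When the unclamped ratio falls outside the limits, the speech cannot fill the slot exactly without sounding rushed or dragged. In that case, changing the text itself is usually the better fix.
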
Starting a dubbing job
- From Add Languages, select your target languages
- Enable the Dubbing toggle alongside (or instead of) captions
- Click Start
Reviewing dubbed audio
Once the dubbing job completes, open the video workspace and select the dubbed language track. The player plays the dubbed audio in sync with the video so you can review it before publishing.
Downloading dubbed audio
Click the download icon on any completed language card to access:
- Merged Audio (MP3) — Dubbed speech mixed with the instrumental background track
- Dubbed Audio (MP3) — Dubbed speech only, no background audio
- Instrumental (WAV) — Background audio only (extracted from original)
- Audio Segments (ZIP) — Individual WAV files for each segment
Tips for best results
- Single clear speaker — Videos with one speaker produce the best cloning results
- Minimal background noise — Heavy music or ambient noise degrades voice clone quality
- Sufficient reference audio — Videos under 30 seconds may not provide enough audio for accurate cloning
- Edit dubbed captions — If synthesis sounds unnatural, try editing the translated caption text to be shorter or simpler; the dubbed audio will then regenerate
Music content detection
Neolli automatically analyzes audio to detect whether a video contains primarily music rather than speech. This check runs during the first dubbing attempt for each video.
What happens when music is detected:
- The dubbing job fails immediately with a music_detected error (credits are automatically refunded)
- The video is permanently flagged as music content
- All future dubbing requests for that video are blocked — both from the localization modal and per-language “Generate Dubbing” buttons
- Transcription, translation, and caption generation remain available
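
The rules above can be read as a small state model. The sketch below is not Neolli's code: the `Video` fields, the credit cost, and the charge-then-refund order are assumptions chosen to mirror the listed behavior. Only the `music_detected` error name comes from this page.

```python
from dataclasses import dataclass


class DubbingBlocked(Exception):
    """Raised when a video already flagged as music requests dubbing."""


@dataclass
class Video:
    is_mostly_music: bool        # stand-in for the result of the audio analysis
    credits: int                 # hypothetical credit balance
    music_checked: bool = False
    flagged_as_music: bool = False


def request_dubbing(video, cost=1):
    """Return "ok" or "music_detected", following the rules listed above."""
    if video.flagged_as_music:
        raise DubbingBlocked("video is flagged as music content")
    video.credits -= cost                  # assumed: credits are charged up front
    if not video.music_checked:            # the check runs on the first attempt only
        video.music_checked = True
        if video.is_mostly_music:
            video.credits += cost          # automatic refund
            video.flagged_as_music = True  # the flag is never cleared
            return "music_detected"
    return "ok"


def request_captions(video):
    """Transcription, translation, and captions ignore the music flag."""
    return "ok"


if __name__ == "__main__":
    song = Video(is_mostly_music=True, credits=5)
    print(request_dubbing(song), "credits:", song.credits)
    try:
        request_dubbing(song)
    except DubbingBlocked as err:
        print("blocked:", err)
    print("captions:", request_captions(song))
```
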