Audio quality is the single biggest factor affecting transcription accuracy and dubbing quality. Here’s how to get the best results.

Getting accurate transcriptions

Do:
  • Use clean audio with minimal background noise
  • Record with a good microphone (lapel or directional)
  • Speak clearly at a moderate pace
  • Select the correct source language (or use auto-detect for common languages)
Avoid:
  • Heavy background music during speech
  • Multiple speakers talking simultaneously
  • Echo-heavy rooms or outdoor environments with wind
  • Very fast speech or heavy accents, unless you plan to post-edit the transcript
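
If you want a rough, objective check for background noise before uploading, the sketch below compares the loudest and quietest parts of a recording. It is only an illustrative heuristic, not how Neolli evaluates audio: the function names, the frame size, and the percentile choices are all assumptions, and the loader handles 16-bit mono WAV files only.

```python
import array
import math
import sys
import wave


def load_mono_16bit_wav(path):
    """Read a 16-bit mono WAV file into a list of integer samples."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected a 16-bit mono WAV file")
        data = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        data.byteswap()  # WAV samples are stored little-endian
    return list(data)


def frame_rms(samples, frame_size):
    """Return the RMS level of each full frame of `frame_size` samples."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        levels.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return levels


def estimate_snr_db(samples, frame_size=400):
    """Rough speech-to-noise estimate in dB (loud frames vs. quiet frames)."""
    levels = sorted(frame_rms(samples, frame_size))
    if len(levels) < 10:
        raise ValueError("need at least 10 frames of audio")
    noise = levels[len(levels) // 10]         # about the 10th percentile
    speech = levels[(len(levels) * 9) // 10]  # about the 90th percentile
    if speech == 0:
        return 0.0  # the recording is entirely silent
    if noise == 0:
        return math.inf  # pauses are digitally silent
    return 20 * math.log10(speech / noise)
```

To use it, call `estimate_snr_db(load_mono_16bit_wav("take1.wav"))` (the file name is a placeholder). A low value suggests that pauses are nearly as loud as speech, which usually means noise or music under the voice. The threshold that counts as "clean enough" is something to calibrate against your own recordings.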

Speaker diarization

Neolli automatically identifies different speakers in your video. For best results:
  • Speakers should have distinct voices
  • Minimize crosstalk (speakers talking over each other)
  • Longer speaking turns produce more reliable speaker identification
If speaker labels are wrong, you can reassign them in the caption editor using the T shortcut.

Videos that are primarily music (no speech) will fail dubbing with a music_detected error. This is expected: voice dubbing requires spoken content to clone and re-synthesize. Credits are refunded automatically.
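
Because a music_detected failure is expected behavior rather than a transient fault, resubmitting the same file will not help. A minimal sketch of how an integration might encode that rule follows. The job dictionary and its "status" and "error" fields are hypothetical and shown for illustration only; only the music_detected error code comes from this page.

```python
# Errors that will recur if the same file is submitted again.
# Only "music_detected" is documented here; extend the set as needed.
NON_RETRYABLE_ERRORS = {"music_detected"}


def should_retry_dubbing(job):
    """Return True only if a failed job is worth resubmitting unchanged.

    `job` is a hypothetical dict such as
    {"status": "failed", "error": "music_detected"}.
    """
    if job.get("status") != "failed":
        return False  # nothing to retry
    return job.get("error") not in NON_RETRYABLE_ERRORS
```

For a music_detected failure, the useful next step is to pick a source video that contains speech, not to retry.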