Overview

This guide walks you through the complete FineVoice API workflow, from getting your API key to generating speech, converting voices, creating sound effects, and separating audio tracks. Later sections cover speech to text, voice cloning, music generation, audio enhancement, and podcast generation. All audio processing tasks follow the same async pattern: submit a request, get a task_id, then poll for the result.

Get your API Key

  1. Open FineVoice and click Sign up in the top-right corner.
  2. Choose a sign-up method: Google, Apple, or Email.
  3. After logging in, go to the User Center at https://finevoice.ai/usercenter.
  4. Navigate to API Tokens.
  5. Click Generate Secret Key and copy the key.
Keep your API key secret. Never commit it to version control or expose it in client-side code.
Store the key as an environment variable; all examples below read it from there. On macOS or Linux:
export FINEVOICE_API_KEY="your_api_key_here"
Windows Command Prompt:
set FINEVOICE_API_KEY=your_api_key_here
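In a script, the key can be read from the same environment variable instead of being hard-coded. The following is a minimal Python sketch; `auth_headers` is a hypothetical helper name, and the header layout simply mirrors the curl examples in this guide.

```python
import os


def auth_headers(env_var: str = "FINEVOICE_API_KEY") -> dict:
    """Build request headers from the API key stored in the environment."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Raising on a missing key keeps a misconfigured environment from surfacing later as an authentication error.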

Async Task Pattern

All audio processing endpoints work the same way:
Step 1: Submit the request

Send a POST request with your audio task parameters. The API immediately returns a task_id.
{ "task_id": "p1-a1b2c3d4-e5f6-7890-abcd-ef1234567890" }
Step 2: Poll for the result

Use GET /v1/task/{task_id} to check the status. Poll every 2–3 seconds until status is completed or failed.
{
  "task_id": "p1-a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "completed",
  "url": "https://dlfile.fineshare.net/output/a1b2c3d4.mp3",
  "error": null
}
Status      Meaning
pending     Task queued, not yet started
processing  Task is being processed
completed   Task finished; url contains the download link
failed      Task failed; error contains the reason
Step 3: Download the output

curl -L -o output.mp3 "https://dlfile.fineshare.net/output/a1b2c3d4.mp3"
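The poll step above can be sketched as a small loop in Python. This is an illustration under assumptions, not an official client: `fetch_task` and `wait_for_task` are hypothetical names, and the 2.5-second interval and 300-second timeout are arbitrary defaults rather than documented limits. The status values and response fields are the ones listed in the table above.

```python
import json
import os
import time
import urllib.request

API_BASE = "https://apis.finevoice.ai"


def fetch_task(task_id: str) -> dict:
    """GET /v1/task/{task_id} and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/v1/task/{task_id}",
        headers={"Authorization": f"Bearer {os.environ['FINEVOICE_API_KEY']}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(task_id, fetch=fetch_task, interval=2.5, timeout=300.0):
    """Poll until the task is completed or failed; return the output URL.

    The interval and timeout defaults are assumptions, not documented limits.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)
        if task["status"] == "completed":
            return task["url"]
        if task["status"] == "failed":
            raise RuntimeError(f"Task {task_id} failed: {task.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task {task_id} still {task['status']} after {timeout}s")
        time.sleep(interval)  # pending or processing: wait, then ask again
```

Passing the fetch function as a parameter keeps the loop testable without network access. Stopping on failed as well as completed avoids polling forever on a task that will never finish.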

1. Text to Speech

Convert text into natural-sounding speech. Supports 1,500+ AI voices and emotion tags like [happy], [sad], [breathe].
Step 1: Submit the TTS request

curl -X POST https://apis.finevoice.ai/v1/audio/speech-synthesis \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice": "james",
    "text": "[happy] Hi, welcome to FineVoice! [breathe] Let me show you what I can do."
  }'
Response:
{ "task_id": "p1-a1b2c3d4-e5f6-7890-abcd-ef1234567890" }
Step 2: Poll for result

curl -X GET https://apis.finevoice.ai/v1/task/p1-a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  -H "Authorization: Bearer $FINEVOICE_API_KEY"
Response when completed:
{
  "task_id": "p1-a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "completed",
  "url": "https://dlfile.fineshare.net/output/a1b2c3d4.mp3",
  "error": null
}
Step 3: Download the audio

curl -L -o tts_output.mp3 "https://dlfile.fineshare.net/output/a1b2c3d4.mp3"
Use the List Voices API to browse all available voice models and find the right voice name for your project.
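When the text is assembled from several pieces, the emotion tags can be added programmatically. The sketch below is Python and reproduces only the tag-before-text layout shown in the request above; `tagged_text` and `tts_payload` are hypothetical helper names, and the set of valid tags is whatever the API accepts, not something this code checks.

```python
import json


def tagged_text(segments):
    """Join (tag, text) pairs into one string; a tag of None adds no tag."""
    parts = []
    for tag, text in segments:
        parts.append(f"[{tag}] {text}" if tag else text)
    return " ".join(parts)


def tts_payload(voice, segments):
    """Return the JSON body for POST /v1/audio/speech-synthesis."""
    return json.dumps({"voice": voice, "text": tagged_text(segments)})


body = tts_payload("james", [
    ("happy", "Hi, welcome to FineVoice!"),
    ("breathe", "Let me show you what I can do."),
])
print(body)
```

The printed body carries the same voice and text as the curl example and can be sent with any HTTP client.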

2. Voice Conversion

Transform the voice in an existing audio file to a different AI voice while preserving the original content and timing.
Step 1: Submit the conversion request

curl -X POST https://apis.finevoice.ai/v1/audio/voice-conversion \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice": "madison",
    "sourceUrl": "https://dlaudio.fineshare.net/cover/speak/30f23d17-634d-420e-99e7-d24097dc669b.mp3",
    "outputFormat": "mp3",
    "useAsync": true
  }'
Response:
{ "task_id": "p1-b2c3d4e5-f6a7-8901-bcde-f12345678901" }
Step 2: Poll for result

curl -X GET https://apis.finevoice.ai/v1/task/p1-b2c3d4e5-f6a7-8901-bcde-f12345678901 \
  -H "Authorization: Bearer $FINEVOICE_API_KEY"
Step 3: Download converted audio

curl -L -o converted.mp3 "https://dlfile.fineshare.net/output/b2c3d4e5.mp3"
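The download step can also be done from Python using only the standard library. This is a minimal sketch: `download` is a hypothetical helper name, the 60-second timeout is an arbitrary choice, and the URL passed in should be the url value from the completed task rather than the placeholder shown in the curl command.

```python
import shutil
import urllib.request


def download(url: str, destination: str) -> str:
    """Stream the file at `url` to `destination` and return the path."""
    # urlopen follows HTTP redirects by default, like curl's -L flag.
    with urllib.request.urlopen(url, timeout=60) as response:
        with open(destination, "wb") as out_file:
            shutil.copyfileobj(response, out_file)
    return destination
```

Copying in chunks with `shutil.copyfileobj` avoids loading the whole audio file into memory.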

3. Sound Effect Generation

Generate royalty-free sound effects from a text description. Perfect for videos, games, and podcasts.
Step 1: Submit the SFX request

curl -X POST https://apis.finevoice.ai/v1/audio/sfx-generation \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Thunderstorm with heavy rain and distant thunder",
    "negative_prompt": "music, voices",
    "duration": 5.0,
    "useAsync": true
  }'
Response:
{ "task_id": "p1-c3d4e5f6-a7b8-9012-cdef-123456789012" }
Step 2: Poll and download

curl -X GET https://apis.finevoice.ai/v1/task/p1-c3d4e5f6-a7b8-9012-cdef-123456789012 \
  -H "Authorization: Bearer $FINEVOICE_API_KEY"

curl -L -o thunderstorm.mp3 "https://dlfile.fineshare.net/output/c3d4e5f6.mp3"
You can also generate effects directly from a video by providing sourceUrl and sourceType:
{
  "sourceUrl": "https://example.com/video/clip.mp4",
  "sourceType": "video",
  "duration": 10.0,
  "useAsync": true
}
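The two request shapes above (text prompt and video source) can be produced by one helper. The following Python sketch is an assumption-laden illustration: `sfx_payload` is a hypothetical name, the field names are copied from the examples in this section, and whether the API accepts a prompt together with a video source is not stated here, so the helper simply passes through whatever is given.

```python
import json


def sfx_payload(prompt=None, source_url=None, source_type="video",
                negative_prompt=None, duration=5.0):
    """Build the JSON body for POST /v1/audio/sfx-generation."""
    if not prompt and not source_url:
        raise ValueError("Provide a text prompt or a source URL")
    payload = {"duration": float(duration), "useAsync": True}
    if prompt:
        payload["prompt"] = prompt
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    if source_url:
        # Field names follow the video example in this section.
        payload["sourceUrl"] = source_url
        payload["sourceType"] = source_type
    return json.dumps(payload)
```

For example, `sfx_payload(source_url="https://example.com/video/clip.mp4", duration=10)` yields the video request body shown above.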

4. Audio Separation

Separate vocals from background music in any audio file. Ideal for remixing, karaoke creation, or vocal extraction.
Step 1: Submit the separation request

curl -X POST https://apis.finevoice.ai/v1/audio/separation \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "sourceUrl": "https://webresources.fineshare.net/finevoice3/audio/isolator-original.mp3",
    "model": "vocal-remover",
    "useAsync": true
  }'
Response:
{ "task_id": "p1-d4e5f6a7-b8c9-0123-defa-234567890123" }
Step 2: Poll and download

curl -X GET https://apis.finevoice.ai/v1/task/p1-d4e5f6a7-b8c9-0123-defa-234567890123 \
  -H "Authorization: Bearer $FINEVOICE_API_KEY"

curl -L -o vocals.mp3 "https://dlfile.fineshare.net/output/d4e5f6a7.mp3"
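Because every submit step in this guide is a JSON POST that returns a task_id, the request can be wrapped in one function. The sketch below is Python with hypothetical names (`http_post`, `submit_task`); it assumes the response body always carries a task_id field, as in the examples shown, and does no error handling beyond what urllib raises.

```python
import json
import os
import urllib.request

API_BASE = "https://apis.finevoice.ai"


def http_post(url: str, body: bytes, headers: dict) -> bytes:
    """Send a POST request and return the raw response body."""
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def submit_task(path: str, payload: dict, api_key=None, post=http_post) -> str:
    """POST `payload` to `path` and return the task_id from the response."""
    api_key = api_key or os.environ["FINEVOICE_API_KEY"]
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    raw = post(API_BASE + path, json.dumps(payload).encode("utf-8"), headers)
    return json.loads(raw)["task_id"]
```

Calling `submit_task("/v1/audio/separation", {"sourceUrl": "...", "model": "vocal-remover", "useAsync": True})` would correspond to the curl request in step 1. The `post` parameter exists so the transport can be swapped out in tests.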

5. Speech to Text

Transcribe speech from an audio or video URL. Supports optional speaker diarization and word-level timestamps.
Step 1: Submit the STT request

curl -X POST https://apis.finevoice.ai/v1/audio/stt \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/interview.mp3",
    "language": "en",
    "format": "json",
    "speaker_diarization": true,
    "max_speakers": 2,
    "useAsync": true
  }'
Response:
{ "task_id": "p1-e5f6a7b8-c9d0-1234-efab-345678901234" }
Step 2: Poll for result

curl -X GET https://apis.finevoice.ai/v1/task/p1-e5f6a7b8-c9d0-1234-efab-345678901234 \
  -H "Authorization: Bearer $FINEVOICE_API_KEY"

6. Voice Cloning

Train a custom AI voice model from a short audio recording. Once trained, the voice name can be used in any TTS or Voice Conversion request.
curl -X POST https://apis.finevoice.ai/v1/voice/train \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My Custom Voice",
    "languageCode": "en-US",
    "gender": "female",
    "audioUrl": "https://example.com/my_voice_sample.wav"
  }'
For best results, use a clean 30–120 second recording with no background noise.
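Before uploading a sample, its length can be checked locally against the recommended range. This Python sketch uses the standard `wave` module, so it only handles uncompressed PCM WAV files, and it says nothing about background noise; the function names are hypothetical.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_good_sample_length(path: str, minimum: float = 30.0, maximum: float = 120.0) -> bool:
    """True when the recording falls inside the recommended 30-120 s range."""
    return minimum <= wav_duration_seconds(path) <= maximum
```

A check like this catches clips that are too short or too long before a training request is submitted.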

7. Music Generation

By Prompt

Generate a music track from a text description.
curl -X POST https://apis.finevoice.ai/v1/music/musicgenbyprompt \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Upbeat electronic dance music with heavy bass and synth leads",
    "instrumental": true,
    "duration": 30
  }'

With Lyrics

Generate a full song with vocals using your own lyrics and style description.
curl -X POST https://apis.finevoice.ai/v1/music/musicgen \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Morning Light",
    "style": "pop, warm, acoustic guitar",
    "lyrics": "[Verse 1]\nWake up to the morning light\nEverything is gonna be alright",
    "instrumental": false,
    "modelVersion": "v3"
  }'

Music Cover

Replace the vocals of an existing song with an AI voice.
curl -X POST https://apis.finevoice.ai/v1/music/cover \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice": "james",
    "sourceUrl": "https://example.com/original_song.mp3",
    "engine": "v5",
    "pitch": 0,
    "outputFormat": "mp3"
  }'

8. Audio Enhancement

Quick Enhancement

Reduce background noise from a single audio file.
curl -X POST https://apis.finevoice.ai/v1/enhancer/speech_enhancement \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/noisy_recording.mp3",
    "model": "MossFormer2_SE_48K",
    "output_format": "mp3"
  }'

All-in-One Pipeline

Run multiple enhancement steps in a single request: noise reduction, filler word removal, silence trimming, and loudness normalization.
curl -X POST https://apis.finevoice.ai/v1/enhancer/process/pipeline \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/podcast_raw.mp3",
    "step_speech_enhancement": true,
    "step_remove_long_silences": true,
    "step_filler_words_remove": true,
    "step_audio_normalization": true,
    "filler_use_whisper": true,
    "filler_language": "en",
    "norm_method": "peak",
    "output_format": "mp3"
  }'
The Pipeline processes steps in a fixed order: Speech Enhancement → Remove Mouth Sounds → Remove Long Silences → Super Resolution → Filler Words Removal → Stuttering Removal → Audio Normalization. Enable only the steps you need.
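Selecting steps can be sketched as a small payload builder in Python. Only the four step_ flags that appear in the request above are supported; the flag names for the other steps (mouth sounds, super resolution, stuttering) are not shown in this guide, so they are deliberately left out rather than guessed. `pipeline_payload` is a hypothetical helper, and sending an explicit false for unselected steps is an assumption about what the API accepts.

```python
import json

# Flag names taken from the request above, listed in the documented run order.
# Steps whose flag names this guide does not show are left out on purpose.
KNOWN_STEP_FLAGS = {
    "speech_enhancement": "step_speech_enhancement",
    "remove_long_silences": "step_remove_long_silences",
    "filler_words_remove": "step_filler_words_remove",
    "audio_normalization": "step_audio_normalization",
}


def pipeline_payload(url, steps, output_format="mp3", **options):
    """Build the pipeline request body with only the requested steps enabled."""
    unknown = set(steps) - set(KNOWN_STEP_FLAGS)
    if unknown:
        raise ValueError(f"Unknown steps: {sorted(unknown)}")
    payload = {"url": url, "output_format": output_format}
    for name, flag in KNOWN_STEP_FLAGS.items():
        # Assumption: an explicit false is accepted for steps left disabled.
        payload[flag] = name in steps
    payload.update(options)  # e.g. filler_language="en", norm_method="peak"
    return json.dumps(payload)
```

For instance, `pipeline_payload("https://example.com/podcast_raw.mp3", ["speech_enhancement", "audio_normalization"], norm_method="peak")` enables noise reduction and normalization only.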

9. Podcast Generation

Generate a multi-speaker AI podcast from a prompt or script.
curl -X POST https://apis.finevoice.ai/v1/audio/podcastgen \
  -H "Authorization: Bearer $FINEVOICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A friendly 3-minute discussion about the future of AI voice technology",
    "speakers": ["olivia", "ethan"],
    "style": "conversational",
    "expectedDuration": "3min",
    "useAsync": true
  }'

Support

Need help? Check out these resources: