POST /v1/enhancer/speech_enhancement
Speech Enhancement
curl --request POST \
  --url https://apis.finevoice.ai/v1/enhancer/speech_enhancement \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "url": "https://example.com/audio.mp3",
  "model": "MossFormer2_SE_48K",
  "use_vad": false,
  "enable_normalization": false,
  "normalization_method": "peak",
  "normalization_peak_db": -1,
  "normalization_rms_db": -20,
  "output_format": "wav"
}
'
{}

Authorizations

Authorization
string
header
required

Bearer token (API key). Format: Bearer {your_api_key}

Body

application/json

The speech enhancement request payload.

url
string

URL of the audio file to enhance (http or https). Supported input formats: WAV, MP3, FLAC, AAC, OGG, OPUS, M4A, WEBM.

Example:

"https://example.com/audio.mp3"

model
string
default:MossFormer2_SE_48K

Enhancement model. Supported: MossFormer2_SE_48K, FRCRN_SE_16K, MossFormerGAN_SE_16K.

Example:

"MossFormer2_SE_48K"

use_vad
boolean
default:false

Enable VAD (Voice Activity Detection) preprocessing.

enable_normalization
boolean
default:false

Enable audio normalization after enhancement.

normalization_method
string
default:peak

Normalization method: peak, rms, or both.

normalization_peak_db
number
default:-1

Target peak level in dBFS.

normalization_rms_db
number
default:-20

Target RMS level in dBFS.
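
To make the two dBFS targets concrete, here is a minimal sketch of how a target level translates into a linear gain. It illustrates the general definitions of peak and RMS normalization only; how the service computes levels internally is not documented here. The function names are hypothetical, and the sketch assumes samples are floats scaled so that full scale is 1.0.

```python
import math
from typing import List, Sequence


def peak_dbfs(samples: Sequence[float]) -> float:
    """Peak level in dBFS, assuming full scale is 1.0."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("signal is silent; level is undefined")
    return 20 * math.log10(peak)


def rms_dbfs(samples: Sequence[float]) -> float:
    """RMS level in dBFS, assuming full scale is 1.0 (one common convention)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        raise ValueError("signal is silent; level is undefined")
    return 20 * math.log10(rms)


def gain_for_target(current_db: float, target_db: float) -> float:
    """Linear gain that moves a signal from current_db to target_db."""
    return 10 ** ((target_db - current_db) / 20)


def apply_gain(samples: Sequence[float], gain: float) -> List[float]:
    """Scale every sample by the same linear gain."""
    return [s * gain for s in samples]
```

For example, a signal peaking at 0.5 (about -6.02 dBFS) needs a gain of roughly 1.78 to reach the default peak target of -1 dBFS.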

output_format
string
default:wav

Output format: wav, mp3, flac, or m4a.
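
The parameters above can be assembled into a request body with a small helper. This is a sketch, not an official client: `build_payload` and the three constant sets are hypothetical names, and the client-side checks simply mirror the values listed in the parameter descriptions. The server may validate differently.

```python
from typing import Any, Dict

# Allowed values as listed in the parameter descriptions above.
SUPPORTED_MODELS = {"MossFormer2_SE_48K", "FRCRN_SE_16K", "MossFormerGAN_SE_16K"}
NORMALIZATION_METHODS = {"peak", "rms", "both"}
OUTPUT_FORMATS = {"wav", "mp3", "flac", "m4a"}


def build_payload(
    url: str,
    model: str = "MossFormer2_SE_48K",
    use_vad: bool = False,
    enable_normalization: bool = False,
    normalization_method: str = "peak",
    normalization_peak_db: float = -1,
    normalization_rms_db: float = -20,
    output_format: str = "wav",
) -> Dict[str, Any]:
    """Hypothetical helper: check arguments locally and return the JSON body."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must start with http:// or https://")
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if normalization_method not in NORMALIZATION_METHODS:
        raise ValueError(f"unsupported normalization_method: {normalization_method!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    return {
        "url": url,
        "model": model,
        "use_vad": use_vad,
        "enable_normalization": enable_normalization,
        "normalization_method": normalization_method,
        "normalization_peak_db": normalization_peak_db,
        "normalization_rms_db": normalization_rms_db,
        "output_format": output_format,
    }
```

Calling `build_payload("https://example.com/audio.mp3")` with no other arguments reproduces the documented defaults, which matches the body in the curl example.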

Response

Enhanced audio returned.

The response is of type object.
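
As a rough Python equivalent of the curl request, the sketch below posts a JSON body and parses the JSON object that comes back. The fields of the response object are not documented here, so the result is returned as a plain dictionary without interpretation. `build_request` and `enhance` are hypothetical helper names, the 60-second timeout is an arbitrary assumption, and the `opener` parameter exists only so the call can be exercised without network access.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

# Endpoint taken from the curl example on this page.
ENDPOINT = "https://apis.finevoice.ai/v1/enhancer/speech_enhancement"


def build_request(payload: Dict[str, Any], api_key: str) -> urllib.request.Request:
    """Build the POST request with the Bearer token and JSON body."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def enhance(
    payload: Dict[str, Any],
    api_key: str,
    opener: Callable[..., Any] = urllib.request.urlopen,
    timeout: float = 60.0,  # assumed value, not documented by the API
) -> Dict[str, Any]:
    """Send the request and return the parsed JSON response object."""
    request = build_request(payload, api_key)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with opener(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body)


# Usage (performs a real network call, so it is left commented out):
# result = enhance({"url": "https://example.com/audio.mp3"}, "your_api_key")
# print(result)
```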