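Audio Terms Explained

Clear, jargon-free definitions of audio production terms. Whether you're a podcaster, musician, or content creator, understand the tools and techniques that power modern audio.

Audio Processing

Stem Splitting

The process of separating a mixed audio track into individual components (stems) such as vocals, drums, bass, and other instruments. Modern AI-powered stem splitters can isolate dozens of different elements from a single song, enabling remixes, karaoke creation, and sampling.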

Vocal Removal

A specific type of stem splitting that isolates and removes vocals from a song, leaving only the instrumental backing track. Commonly used to create karaoke versions or instrumental covers.
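
To make the idea of removing vocals concrete, here is a minimal sketch of the classic, non-AI center-channel cancellation trick. It is not how AI stem splitters work and is only an illustration. It assumes a stereo recording in which the vocal is mixed identically into both channels; the function name `remove_center` and the sample values are hypothetical.

```python
def remove_center(left, right):
    """Subtract the right channel from the left.

    Anything mixed identically into both channels (often the lead vocal)
    cancels out; sounds panned to one side survive at half amplitude.
    """
    return [(l - r) / 2 for l, r in zip(left, right)]


# Toy signals: a centered vocal plus a guitar panned fully left.
vocal = [0.5, -0.5, 0.25]
guitar = [0.2, 0.1, -0.4]
left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)

instrumental = remove_center(left, right)
print(instrumental)  # roughly the guitar at half amplitude, with the vocal gone
```

The trick also removes any instrument mixed to the center, such as bass or kick drum, which is one reason learned separation models are generally preferred.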

Noise Reduction

The process of removing unwanted background sounds from audio recordings, including hum, hiss, wind noise, echo, and ambient sounds. AI-powered noise reduction can intelligently preserve voice quality while eliminating distractions.

Audio Enhancement

Improving the overall quality of audio through techniques like EQ adjustment, compression, normalization, and clarity boosting. Often combined with noise reduction for podcast and video production.
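
Of the techniques listed, normalization is the simplest to show in code. The sketch below does peak normalization on a list of floating-point samples, assuming full scale is 1.0. The function name `peak_normalize` and the default target of 0.9 are illustrative choices, not taken from any particular tool.

```python
def peak_normalize(samples, target_peak=0.9):
    """Scale samples so the loudest one reaches target_peak (full scale = 1.0)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


quiet_take = [0.1, -0.25, 0.2]
louder = peak_normalize(quiet_take)
print(max(abs(s) for s in louder))  # close to 0.9
```

Peak normalization only applies one gain to the whole clip. Loudness normalization and compression change the signal in more involved ways and are not covered by this sketch.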

Voice AI

Voice Cloning

AI technology that creates a digital replica of a person's voice from a short audio sample (typically 5-30 seconds). The cloned voice can then speak any text while maintaining the original speaker's unique vocal characteristics, tone, and style.

Text-to-Speech (TTS)

Technology that converts written text into spoken audio. Modern TTS systems use AI to produce natural-sounding speech with proper intonation, emotion, and pacing, and can support 200+ languages for global content creation.

Speech-to-Text (STT)

Also known as automatic speech recognition (ASR), this converts spoken audio into written text. Used for transcription, subtitles, and searchable audio content.

Voice Changer

Software that modifies the characteristics of a voice recording, such as pitch, tone, or timbre. Can be used for anonymization, character voices, or creative effects.
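
Pitch changes are usually expressed in semitones. As a rough illustration, the sketch below converts a semitone shift into a frequency ratio using the standard equal-temperament relationship (12 semitones per octave, one octave doubles the frequency). The function names are hypothetical, and a real voice changer does much more than scale frequencies.

```python
def pitch_ratio(semitones):
    """Frequency ratio for a shift of the given number of semitones."""
    return 2 ** (semitones / 12)


def shifted_frequency(frequency_hz, semitones):
    """Frequency after shifting frequency_hz up (+) or down (-) in semitones."""
    return frequency_hz * pitch_ratio(semitones)


print(pitch_ratio(12))            # one octave up doubles the frequency
print(shifted_frequency(220, -12))  # one octave down halves it
```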

Music Production

AI Music Generation

Creating original music using artificial intelligence. Users describe what they want (genre, mood, instruments) and AI composes complete songs with melody, harmony, rhythm, and sometimes vocals and lyrics.

Instrumental

A version of a song without vocals, featuring only the musical instruments. Created either through stem splitting (removing vocals from existing songs) or AI generation.

Acapella (A Cappella)

The isolated vocal track from a song, without any instrumental backing. Extracted using stem splitting technology for remixing, sampling, or vocal analysis.

BPM (Beats Per Minute)

A measurement of tempo in music, indicating how many beats occur in one minute. Essential for DJs, music producers, and anyone syncing audio to video or other tracks.
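
The arithmetic behind BPM is simple enough to sketch. The helpers below (hypothetical names) convert a tempo into the length of one beat and compute the playback-rate change needed to match one track's tempo to another, as a DJ would when syncing two songs.

```python
def beat_duration_seconds(bpm):
    """Length of one beat in seconds."""
    return 60.0 / bpm


def tempo_match_rate(source_bpm, target_bpm):
    """Playback-rate multiplier that makes a source track play at target_bpm."""
    return target_bpm / source_bpm


print(beat_duration_seconds(120))   # 0.5
print(tempo_match_rate(120, 126))   # 1.05, i.e. play 5% faster
```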

Sample

A portion of a sound recording reused in another recording. Stem splitting enables clean sample extraction from existing songs for use in new productions.

Transcription

Speaker Diarization

The process of automatically identifying and labeling different speakers in an audio recording. Essential for podcast transcription, meeting notes, and interview processing. Answers 'who spoke when' in multi-speaker recordings.
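
One way to picture diarization output is as a list of labeled time segments. The sketch below assumes a hypothetical format of `(speaker, start_seconds, end_seconds)` tuples, which is not the output format of any specific tool, and totals the talk time per speaker.

```python
from collections import defaultdict

# Hypothetical diarization result: who spoke, and from when to when (seconds).
segments = [
    ("Speaker A", 0.0, 4.2),
    ("Speaker B", 4.2, 9.0),
    ("Speaker A", 9.0, 12.5),
]


def talk_time(segments):
    """Total speaking time per speaker, in seconds."""
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += end - start
    return dict(totals)


for speaker, seconds in talk_time(segments).items():
    print(f"{speaker}: {seconds:.1f} s")
```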

Speaker Separation

Going beyond diarization, speaker separation actually isolates each speaker's audio onto separate tracks. This allows individual editing, volume adjustment, and processing of each voice in a conversation.

Transcription

The process of converting audio or video content into written text. Transcripts often include word-level timestamps, speaker identification, and punctuation. Used for accessibility, SEO, content repurposing, and searchability.
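
Word-level timestamps are what make subtitles possible. As a sketch, the code below turns a hypothetical list of `(word, start_seconds, end_seconds)` tuples into a single SRT subtitle cue. The input format and function names are assumptions for illustration; only the `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line follows the SRT convention.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt_cue(index, words):
    """Build one SRT cue from a non-empty list of (word, start, end) tuples."""
    start = words[0][1]
    end = words[-1][2]
    text = " ".join(word for word, _, _ in words)
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


words = [("Hello", 0.0, 0.4), ("world", 0.5, 0.9)]
print(to_srt_cue(1, words))
```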

General Audio

Bitrate

The amount of data used per second of audio, measured in kbps (kilobits per second). Higher bitrates generally mean better audio quality but larger file sizes. Common MP3 values: 128 kbps (acceptable), 256 kbps (good), 320 kbps (high quality).
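
The link between bitrate and file size can be estimated with a little arithmetic. This sketch assumes a constant bitrate, 1 kbps = 1,000 bits per second, and 1 MB = 1,000,000 bytes, and it ignores metadata and container overhead. The function name is hypothetical.

```python
def file_size_megabytes(bitrate_kbps, duration_seconds):
    """Approximate size of a constant-bitrate audio file in megabytes."""
    total_bits = bitrate_kbps * 1000 * duration_seconds
    return total_bits / 8 / 1_000_000


# A four-minute song at the three bitrates mentioned above.
for kbps in (128, 256, 320):
    print(f"{kbps} kbps: {file_size_megabytes(kbps, 240):.2f} MB")
```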

Sample Rate

The number of samples of audio recorded per second, measured in Hz or kHz. Standard rates include 44.1 kHz (CD quality), 48 kHz (video standard), and 96 kHz (professional audio).
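
Two quantities follow directly from the sample rate: how many samples a recording contains, and the highest frequency it can represent, which is half the sample rate (the Nyquist frequency). The helper names below are hypothetical.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def sample_count(sample_rate_hz, seconds):
    """Number of samples per channel in a recording of the given length."""
    return int(sample_rate_hz * seconds)


for rate in (44_100, 48_000, 96_000):
    print(f"{rate} Hz: up to {nyquist_hz(rate):.0f} Hz, "
          f"{sample_count(rate, 1)} samples per second per channel")
```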

WAV vs MP3

WAV files typically store uncompressed PCM audio, which preserves full quality but creates large files. MP3 is a lossy compressed format that reduces file size by discarding audio data most listeners can barely hear. Use WAV for editing and archiving; MP3 for sharing and streaming.
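
The size gap between the two formats can be worked out by hand. The sketch below estimates one minute of CD-quality uncompressed audio (44.1 kHz, 16-bit, stereo) and compares it with one minute of 320 kbps MP3. It ignores file headers and metadata, and the function names are hypothetical.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed PCM audio data in bytes (headers ignored)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


def cbr_bytes(bitrate_kbps, seconds):
    """Size of constant-bitrate compressed audio in bytes (1 kbps = 1000 bit/s)."""
    return bitrate_kbps * 1000 * seconds // 8


wav_minute = pcm_bytes(44_100, 16, 2, 60)
mp3_minute = cbr_bytes(320, 60)
print(wav_minute, mp3_minute, round(wav_minute / mp3_minute, 2))
```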

FLAC

Free Lossless Audio Codec: a compressed audio format that preserves full quality (unlike MP3). It offers smaller file sizes than WAV with no quality loss, making it ideal for archiving and audiophile listening.

Latency

The delay between input and output in audio processing. Low latency is crucial for real-time applications like live streaming, voice changers, and interactive voice AI.
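
One contributor to latency is easy to quantify: the audio buffer. The sketch below computes the delay introduced by a single buffer, assuming that buffer size is measured in frames and that no other delays (drivers, network, model inference) are counted. The function name is hypothetical.

```python
def buffer_latency_ms(buffer_frames, sample_rate_hz):
    """Delay added by one audio buffer, in milliseconds."""
    return buffer_frames / sample_rate_hz * 1000


for frames in (128, 256, 1024):
    print(f"{frames} frames at 48 kHz: {buffer_latency_ms(frames, 48_000):.2f} ms")
```

Smaller buffers reduce this delay but leave less time to process each block, so real-time tools have to balance the two.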

Ready to try these tools?

AudioPod brings all these audio capabilities together in one platform. Start free, no credit card required.
