WAV to MP3 Technology Explained

Understanding Audio Formats

WAV Format

Uncompressed audio format
PCM (Pulse Code Modulation) encoding
High quality but large file size
Sample rates typically 44.1kHz or 48kHz
16-bit or 24-bit depth
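
The "large file size" follows directly from the parameters listed above. A minimal sketch of the arithmetic, assuming plain PCM and ignoring the small WAV header; the function names are illustrative only:

```python
def pcm_data_rate(sample_rate_hz, bit_depth, channels):
    """Bytes of PCM audio data produced per second of sound."""
    return sample_rate_hz * (bit_depth // 8) * channels


def pcm_bytes(duration_s, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Approximate size of the audio data for a clip of the given length."""
    return int(duration_s * pcm_data_rate(sample_rate_hz, bit_depth, channels))


# One minute of 16-bit stereo audio at 44.1 kHz is roughly 10 MB.
one_minute = pcm_bytes(60)
```

With these defaults, one second of audio occupies 176,400 bytes, so file size grows linearly with duration regardless of how simple or complex the sound is.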

MP3 Format

Lossy compressed audio format
MPEG-1 Audio Layer III standard (extended to lower sample rates by MPEG-2 Audio Layer III)
Smaller file size with good quality
Uses perceptual coding
Bitrates from 32kbps to 320kbps
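
To put the bitrate range in perspective, the sketch below compares an MP3 bitrate with the bitrate of the equivalent uncompressed PCM stream. It assumes constant bitrate and that 1 kbps means 1000 bits per second; the function names are illustrative only:

```python
def mp3_bytes(duration_s, bitrate_kbps):
    """Approximate size of a constant-bitrate MP3 of the given length."""
    return int(duration_s * bitrate_kbps * 1000 / 8)


def compression_ratio(bitrate_kbps, sample_rate_hz=44_100, bit_depth=16, channels=2):
    """How many times smaller the MP3 stream is than the PCM stream."""
    pcm_kbps = sample_rate_hz * bit_depth * channels / 1000
    return pcm_kbps / bitrate_kbps


ratio_128 = compression_ratio(128)  # about 11x smaller than CD-quality PCM
ratio_320 = compression_ratio(320)  # about 4.4x smaller
```

CD-quality PCM runs at 1411.2 kbps, so even the highest MP3 bitrate removes more than three quarters of the data.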

The Conversion Process

1. Audio Decoding

The WAV file's header is parsed to determine the sample rate, bit depth, and number of channels. Because PCM data is already uncompressed, this step amounts to unpacking the stored values into raw audio samples rather than decoding a compressed stream.
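
As an illustration of this step, the sketch below reads the format fields with Python's standard `wave` module. It is not the code this site runs in the browser; `read_wav_format` and `make_silent_wav` are hypothetical helpers, and the second one only exists to produce a test file in memory:

```python
import io
import wave


def read_wav_format(source):
    """Return (sample_rate, bit_depth, channels, frame_count) of a PCM WAV."""
    with wave.open(source, "rb") as wav:
        return (
            wav.getframerate(),
            wav.getsampwidth() * 8,  # sample width is reported in bytes
            wav.getnchannels(),
            wav.getnframes(),
        )


def make_silent_wav(seconds=1, sample_rate=44_100, channels=2, sample_width=2):
    """Build a silent PCM WAV in memory so the reader has something to parse."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(seconds * sample_rate * channels * sample_width))
    buf.seek(0)
    return buf


fmt = read_wav_format(make_silent_wav())
```

Here a "frame" in the `wave` module means one sample per channel, which is unrelated to the MP3 frames described later.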

2. Psychoacoustic Analysis

The encoder performs a Fast Fourier Transform (FFT) to analyze the frequency content of the audio. It identifies sounds that are less perceptible to human hearing based on:

Frequency masking (louder sounds mask quieter ones at similar frequencies)
Temporal masking (sounds mask other sounds that occur shortly before or after)
Absolute threshold of hearing (sounds too quiet to be heard)
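
The third criterion is the easiest to show in code. The sketch below uses Terhardt's widely cited approximation of the threshold in quiet; production encoders use more elaborate models, and mapping digital sample levels to dB SPL requires a calibration assumption that is not shown here. The function names are illustrative only:

```python
import math


def absolute_threshold_db(freq_hz):
    """Approximate threshold of hearing in dB SPL (Terhardt's formula)."""
    khz = freq_hz / 1000.0
    return (
        3.64 * khz ** -0.8
        - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
        + 1e-3 * khz ** 4
    )


def is_below_threshold(freq_hz, level_db_spl):
    """True if a pure tone at this level would be inaudible in quiet."""
    return level_db_spl < absolute_threshold_db(freq_hz)
```

The curve dips around 3 to 4 kHz, where hearing is most sensitive, and rises steeply toward 16 kHz, which is why encoders can spend fewer bits on very high frequencies.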

3. MDCT Transformation

The audio is divided into frames, split into 32 frequency subbands by a polyphase filterbank, and each subband is then transformed using the Modified Discrete Cosine Transform (MDCT). This converts the time-domain audio samples into frequency-domain coefficients.
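
A direct, unoptimized sketch of the transform may help here. It maps a block of 2N samples to N coefficients, and with a sine window and 50% overlap the inverse transform cancels the aliasing between neighbouring blocks. Real encoders use fast algorithms; all names below are illustrative:

```python
import math


def mdct(block):
    """Forward MDCT: 2N time samples -> N frequency coefficients."""
    n = len(block) // 2
    n0 = 0.5 + n / 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + n0) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N aliased time samples."""
    n = len(coeffs)
    n0 = 0.5 + n / 2
    return [
        (2.0 / n) * sum(coeffs[k] * math.cos(math.pi / n * (t + n0) * (k + 0.5))
                        for k in range(n))
        for t in range(2 * n)
    ]


def sine_window(length):
    """Sine window, which satisfies the overlap-add reconstruction condition."""
    return [math.sin(math.pi * (t + 0.5) / length) for t in range(length)]


def overlap_add_middle(signal, n):
    """Transform two overlapping 2N blocks of a 3N signal, rebuild the middle N."""
    w = sine_window(2 * n)
    outputs = []
    for start in (0, n):
        block = [w[t] * signal[start + t] for t in range(2 * n)]
        out = imdct(mdct(block))
        outputs.append([w[t] * out[t] for t in range(2 * n)])
    # Second half of the first block plus first half of the second block.
    return [outputs[0][n + t] + outputs[1][t] for t in range(n)]
```

For scale, an MP3 long block applies a 36-sample window per subband and yields 18 coefficients, which is the N = 18 case of the functions above.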

4. Quantization

Based on the psychoacoustic model, the encoder allocates bits to different frequency bands. More bits are given to perceptually important frequencies, while less important frequencies are coarsely quantized or discarded.
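
The effect of coarse quantization can be sketched with a power-law quantizer loosely modelled on the one MP3 uses. The 0.75 exponent and the 0.0946 rounding offset follow the formula commonly cited from the ISO reference encoder and should be treated as assumptions here; a single `step` stands in for the per-band scale factors a real encoder would choose:

```python
import math


def quantize(coeffs, step):
    """Map coefficients to small integers; a larger step keeps less detail."""
    out = []
    for c in coeffs:
        magnitude = (abs(c) / step) ** 0.75
        q = int(magnitude - 0.0946 + 0.5)  # round to nearest with a small bias
        out.append(q if c >= 0 else -q)
    return out


def dequantize(values, step):
    """Approximate inverse, as a decoder would apply it."""
    return [math.copysign(abs(v) ** (4.0 / 3.0) * step, v) for v in values]


fine = quantize([0.0, 1.0, -16.0, 81.0], step=1.0)
coarse = quantize([0.0, 1.0, -16.0, 81.0], step=16.0)
```

With the larger step, the small coefficient collapses to zero and the remaining values shrink, which is what makes them cheaper to encode at the cost of added quantization noise.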

5. Huffman Coding

The quantized coefficients are compressed using Huffman coding, which assigns shorter codes to more frequent values.
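
The principle can be sketched by building a Huffman code from the value frequencies of a short list of quantized coefficients. Note the difference from the real format: MP3 selects from predefined Huffman tables rather than constructing a code per file, so this is an illustration of the idea only, and `huffman_codes` is a hypothetical helper:

```python
import heapq
from collections import Counter


def huffman_codes(values):
    """Return a {value: bitstring} prefix code built from value frequencies."""
    freq = Counter(values)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie_breaker, partial code table for that subtree).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


quantized = [0] * 8 + [1] * 4 + [-1] * 3 + [5]
codes = huffman_codes(quantized)
total_bits = sum(len(codes[v]) for v in quantized)
```

Because quantized audio spectra contain many zeros and small values, the most common values end up with the shortest codes and the total bit count drops below what a fixed-width encoding would need.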

6. Frame Packing

The processed data is packed into MP3 frames, each beginning with a header that carries metadata such as bitrate and sampling rate. Each MPEG-1 Layer III frame holds 1152 samples per channel, which is about 26 ms of audio at 44.1 kHz.
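
The frame figures can be checked with a few lines of arithmetic. This sketch assumes MPEG-1 Layer III frames of 1152 samples and a constant bitrate; the function names are illustrative only:

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III


def frame_duration_ms(sample_rate_hz):
    """Playback time covered by one frame, in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz


def frame_bytes(bitrate_bps, sample_rate_hz, padding=False):
    """Frame length in bytes: 1152 / 8 = 144 bytes per (bitrate / sample rate)."""
    return 144 * bitrate_bps // sample_rate_hz + (1 if padding else 0)


duration = frame_duration_ms(44_100)   # about 26.12 ms
size = frame_bytes(128_000, 44_100)    # 417 bytes, or 418 with the padding bit set
```

At 44.1 kHz the division does not come out even, which is why encoders set the padding bit on some frames to keep the average bitrate exact.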

Our Implementation

This website uses the LAME encoder (via lamejs) running entirely in your browser. Key features:

Client-side processing ensures your audio never leaves your device
Supports variable bitrate (VBR) and constant bitrate (CBR) encoding
Implements fast psychoacoustic models optimized for web performance
Uses Web Workers to prevent UI freezing during conversion

Technical Specifications

Encoder Version	LAME 3.100 (JavaScript port)
Supported Sample Rates	8kHz, 11.025kHz, 12kHz, 16kHz, 22.05kHz, 24kHz, 32kHz, 44.1kHz, 48kHz
Supported Bitrates	32kbps to 320kbps
Channel Modes	Mono, Stereo, Joint Stereo
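
The sample rates in the table fall into three groups defined by the MP3 format family, and the group determines how many samples each frame carries. A small lookup sketch; the names are illustrative and do not reflect how lamejs organizes this internally:

```python
MPEG_VERSION_BY_RATE = {
    32_000: "MPEG-1", 44_100: "MPEG-1", 48_000: "MPEG-1",
    16_000: "MPEG-2", 22_050: "MPEG-2", 24_000: "MPEG-2",
    # MPEG-2.5 is an unofficial extension for very low sample rates.
    8_000: "MPEG-2.5", 11_025: "MPEG-2.5", 12_000: "MPEG-2.5",
}


def mpeg_version(sample_rate_hz):
    """Return the MPEG audio version used for a given sample rate."""
    try:
        return MPEG_VERSION_BY_RATE[sample_rate_hz]
    except KeyError:
        raise ValueError(f"unsupported sample rate: {sample_rate_hz} Hz") from None


def samples_per_frame(sample_rate_hz):
    """Layer III frames hold 1152 samples in MPEG-1 and 576 otherwise."""
    return 1152 if mpeg_version(sample_rate_hz) == "MPEG-1" else 576
```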
