WAV to MP3 Conversion Technology
Understanding Audio Formats
WAV Format
- Container format that normally stores uncompressed audio
- Audio data is stored with PCM (Pulse Code Modulation) encoding
- High quality but large file size
- Sample rates typically 44.1kHz or 48kHz
- 16-bit or 24-bit depth
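To make "large file size" concrete, the size of the raw PCM data follows directly from sample rate, bit depth, channel count, and duration. The sketch below is illustrative only; the function name `pcm_size_bytes` is an assumption made for this example, and the small WAV header is ignored.

```python
def pcm_size_bytes(sample_rate_hz: int, bit_depth: int, channels: int, seconds: float) -> int:
    """Size of the raw PCM payload in bytes, excluding the WAV header."""
    bytes_per_sample = bit_depth // 8
    return int(sample_rate_hz * bytes_per_sample * channels * seconds)


# One minute of CD-quality stereo audio: 44.1kHz, 16-bit, 2 channels.
one_minute = pcm_size_bytes(44_100, 16, 2, 60)
print(f"{one_minute / 1_000_000:.1f} MB")  # prints "10.6 MB"
```

Moving from 16-bit to 24-bit depth at the same sample rate increases this figure by half.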
MP3 Format
- Lossy compressed audio format
- Defined by the MPEG-1 Audio Layer III standard, with MPEG-2 extensions for lower sample rates
- Smaller file size with good quality
- Uses perceptual coding
- Bitrates from 32kbps to 320kbps
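For constant bitrate files, the bitrate determines the file size almost exactly, which makes the savings easy to estimate. The following is a rough sketch; the helper names `mp3_size_bytes` and `compression_ratio` are made up for this example, and tags and other metadata are ignored.

```python
def mp3_size_bytes(bitrate_kbps: int, seconds: float) -> int:
    """Approximate size of a constant-bitrate MP3 (1 kbps = 1000 bits per second)."""
    return int(bitrate_kbps * 1000 * seconds / 8)


def compression_ratio(sample_rate_hz: int, bit_depth: int, channels: int, bitrate_kbps: int) -> float:
    """How many times smaller the MP3 stream is than the equivalent PCM stream."""
    pcm_bits_per_second = sample_rate_hz * bit_depth * channels
    return pcm_bits_per_second / (bitrate_kbps * 1000)


# One minute at 128kbps, compared with 44.1kHz 16-bit stereo PCM.
print(mp3_size_bytes(128, 60))                 # 960000 bytes
print(compression_ratio(44_100, 16, 2, 128))   # about 11
```

At 320kbps the same comparison gives a ratio of roughly 4.4, which is why higher bitrates trade file size for quality.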
The Conversion Process
1. Audio Decoding
The encoder reads the WAV header to determine the sample rate, bit depth, and number of channels, then extracts the PCM data as raw audio samples. Because PCM is uncompressed, this step is mostly parsing rather than decompression.
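As an illustration of this step, here is a minimal sketch using Python's standard `wave` module. It is not the code this site runs (that happens in JavaScript in the browser). The function name `read_wav` is an assumption for this example, and the sketch only handles 16-bit PCM.

```python
import io
import struct
import wave


def read_wav(source):
    """Return (sample_rate, bit_depth, channels, samples) for a 16-bit PCM WAV.

    `samples` is a flat list of signed integers; for stereo input the
    left and right channels are interleaved (L, R, L, R, ...).
    """
    with wave.open(source, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()   # bytes per sample
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")
    count = len(raw) // 2
    samples = list(struct.unpack(f"<{count}h", raw))  # little-endian int16
    return sample_rate, sample_width * 8, channels, samples


# Build a tiny stereo WAV in memory so the example needs no file on disk.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(2)
    out.setsampwidth(2)
    out.setframerate(44_100)
    out.writeframes(struct.pack("<4h", 0, 100, -100, 32767))
buffer.seek(0)

rate, depth, channel_count, samples = read_wav(buffer)
print(rate, depth, channel_count, samples)  # 44100 16 2 [0, 100, -100, 32767]
```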
2. Psychoacoustic Analysis
The encoder performs a Fast Fourier Transform (FFT) to analyze the frequency content of the audio. It identifies sounds that are less perceptible to human hearing based on:
- Frequency masking (louder sounds mask quieter ones at similar frequencies)
- Temporal masking (a loud sound masks quieter sounds that occur shortly before or after it)
- Absolute threshold of hearing (sounds too quiet to be heard)
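The absolute threshold of hearing is the easiest of these three criteria to show in code. The sketch below uses Terhardt's widely cited approximation of the threshold in quiet. This is an assumption made for illustration: the source does not say which threshold curve the encoder uses, and the function names are invented for this example.

```python
import math


def absolute_threshold_db(frequency_hz: float) -> float:
    """Approximate threshold of hearing in quiet, in dB SPL (Terhardt's formula)."""
    khz = frequency_hz / 1000.0
    return (
        3.64 * khz ** -0.8
        - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
        + 1e-3 * khz ** 4
    )


def is_audible_in_quiet(frequency_hz: float, level_db_spl: float) -> bool:
    """True if a lone tone at this level lies above the threshold in quiet."""
    return level_db_spl > absolute_threshold_db(frequency_hz)


# Hearing is most sensitive around 3-4kHz and much less so near 16kHz.
for freq in (100, 1000, 3300, 16000):
    print(freq, round(absolute_threshold_db(freq), 1))
```

A component that falls below this curve can be dropped with no audible effect. Frequency and temporal masking raise the threshold further around loud sounds, which is what allows most of the bit savings.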
3. MDCT Transformation
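The audio is divided into frames. Each frame is first split into 32 frequency subbands by a polyphase filterbank, and each subband is then transformed using the Modified Discrete Cosine Transform (MDCT). This converts the time-domain audio samples into frequency-domain coefficients.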
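The transform itself can be written directly from its definition. The following is a slow O(N^2) sketch for illustration only: it omits the windowing and block overlap a real encoder applies, and the function name `mdct` is chosen for this example.

```python
import math


def mdct(block):
    """Direct MDCT: takes 2N time-domain samples and returns N coefficients."""
    length = len(block)
    if length == 0 or length % 2:
        raise ValueError("block length must be a positive even number")
    n = length // 2
    return [
        sum(
            block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(length)
        )
        for k in range(n)
    ]


# MP3 long blocks feed 36 samples per subband into the MDCT and get 18 coefficients.
# A signal that matches basis function 3 puts all of its energy in coefficient 3.
n = 18
signal = [math.cos(math.pi / n * (t + 0.5 + n / 2) * (3 + 0.5)) for t in range(2 * n)]
coefficients = mdct(signal)
print(len(coefficients))
print(max(range(n), key=lambda k: abs(coefficients[k])))
```

The output has half as many values as the input because consecutive blocks overlap by 50 percent; the decoder cancels the resulting aliasing when it overlaps and adds the inverse transforms.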
4. Quantization
Based on the psychoacoustic model, the encoder allocates bits to different frequency bands. More bits are given to perceptually important frequencies, while less important frequencies are coarsely quantized or discarded.
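The effect of giving different bands different precision can be shown with a plain uniform quantizer. This is a deliberate simplification: the MP3 format uses a non-uniform, power-law quantizer with per-band scale factors, and the encoder searches for step sizes iteratively. The function names and the step sizes below are hypothetical values picked for the example.

```python
def quantize(coefficients, step_sizes):
    """Map each coefficient to an integer level; a larger step means coarser precision."""
    return [round(c / step) for c, step in zip(coefficients, step_sizes)]


def dequantize(levels, step_sizes):
    """Reconstruct approximate coefficients from integer levels."""
    return [level * step for level, step in zip(levels, step_sizes)]


coefficients = [10.3, -4.2, 0.9, 0.2]
# Fine steps for the first two (perceptually important) values,
# coarse steps for the last two (assumed to be masked).
step_sizes = [0.5, 0.5, 2.0, 2.0]

levels = quantize(coefficients, step_sizes)
print(levels)                          # [21, -8, 0, 0]
print(dequantize(levels, step_sizes))  # [10.5, -4.0, 0.0, 0.0]
```

The two coarsely quantized values collapse to zero, which is how "discarded" frequencies arise in practice, and long runs of zeros are cheap to encode in the next step.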
5. Huffman Coding
The quantized coefficients are compressed using Huffman coding, which assigns shorter codes to more frequent values.
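The principle can be sketched by building a Huffman code from the frequencies of some quantized values. Note the difference from the real format: MP3 selects from fixed code tables defined in the standard rather than building a new code for each file. The function name `huffman_code` and the sample data are made up for this example.

```python
import heapq
from collections import Counter


def huffman_code(values):
    """Return {symbol: bit string} with shorter codes for more frequent symbols."""
    counts = Counter(values)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(weight, i, {symbol: ""}) for i, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        weight_a, _, codes_a = heapq.heappop(heap)
        weight_b, _, codes_b = heapq.heappop(heap)
        merged = {symbol: "0" + code for symbol, code in codes_a.items()}
        merged.update({symbol: "1" + code for symbol, code in codes_b.items()})
        heapq.heappush(heap, (weight_a + weight_b, next_id, merged))
        next_id += 1
    return heap[0][2]


# Quantized coefficients are dominated by zeros and small values.
quantized = [0] * 8 + [1] * 4 + [-1] * 2 + [2]
codes = huffman_code(quantized)
total_bits = sum(len(codes[value]) for value in quantized)
print({symbol: len(code) for symbol, code in codes.items()})
print(total_bits)  # 25 bits, versus 30 with a fixed 2-bit code per value
```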
6. Frame Packing
The processed data is packed into MP3 frames, each starting with a header that carries metadata such as the bitrate and sampling rate. Each frame holds 1152 samples, which is about 26ms of audio at 44.1kHz.
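The frame duration and the size of each frame in bytes both follow from the 1152-sample frame length. The sketch below assumes MPEG-1 Layer III at constant bitrate; the function names are illustrative.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III


def frame_duration_ms(sample_rate_hz: int) -> float:
    """Playback time covered by one frame, in milliseconds."""
    return SAMPLES_PER_FRAME / sample_rate_hz * 1000


def frame_size_bytes(bitrate_kbps: int, sample_rate_hz: int, padding: int = 0) -> int:
    """Frame length in bytes, header included.

    144 is SAMPLES_PER_FRAME / 8 (bits to bytes). `padding` is 0 or 1:
    the encoder adds one byte to some frames so the average bitrate is exact.
    """
    return 144 * bitrate_kbps * 1000 // sample_rate_hz + padding


print(round(frame_duration_ms(44_100), 2))   # 26.12
print(frame_size_bytes(128, 44_100))         # 417
print(frame_size_bytes(128, 44_100, 1))      # 418
print(frame_size_bytes(128, 48_000))         # 384
```

At 44.1kHz the division does not come out even, so a 128kbps stream alternates between 417-byte and 418-byte frames. At 48kHz every frame is exactly 24ms and 384 bytes.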
Our Implementation
This website uses the LAME encoder (via lamejs) running entirely in your browser. Key features:
- Client-side processing ensures your audio never leaves your device
- Supports variable bitrate (VBR) and constant bitrate (CBR) encoding
- Implements fast psychoacoustic models optimized for web performance
- Uses Web Workers to prevent UI freezing during conversion
Technical Specifications
| Specification | Value |
| --- | --- |
| Encoder Version | LAME 3.100 (JavaScript port) |
| Supported Sample Rates | 8kHz, 11.025kHz, 12kHz, 16kHz, 22.05kHz, 24kHz, 32kHz, 44.1kHz, 48kHz |
| Supported Bitrates | 32kbps to 320kbps |
| Channel Modes | Mono, Stereo, Joint Stereo |