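Metrics for current mainstream audio codecs
Data source: WavTokenizer: An Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Formulas:
Assuming each codebook has 2^n entries (so each token carries n bits) and BW is the bandwidth in bits per second: frame rate (Hz, frames per second of audio) = BW / (Nq × n); token/s = Nq × frame rate; equivalently, frame rate = (token/s) / Nq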
Model | Bandwidth | Nq ↓ (number of quantizers) | token/s ↓ | Codebook size |
GT | – | – | – | – |
DAC | 9.0kbps | 9 | 900 | 1024 |
Encodec | 6.0kbps | 8 | 600 | 1024 |
Vocos | 6.0kbps | 8 | 600 | 1024 |
SpeechTokenizer | 6.0kbps | 8 | 600 | 1024 |
DAC | 4.0kbps | 4 | 400 | 1024 |
HiFi-Codec | 3.0kbps | 4 | 400 | 2^7.5 |
HiFi-Codec | 4.0kbps | 4 | 300 | 2^13 |
Encodec | 3.0kbps | 4 | 300 | 1024 |
Vocos | 3.0kbps | 4 | 300 | 1024 |
SpeechTokenizer | 3.0kbps | 4 | 300 | 1024 |
WavTokenizer-small | 0.5kbps | 1 | 40 | 4096 |
WavTokenizer-small | 0.9kbps | 1 | 75 | 4096 |
Mimi | 1.1kbps | 8 | 100 | 2048 |
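
The relations stated before the table can be checked against individual rows with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and it assumes the bandwidth is given in bits per second and that every quantizer uses a codebook of the same size.

```python
import math


def bits_per_token(codebook_size: float) -> float:
    """n in the notes: a codebook with 2^n entries carries n bits per token."""
    return math.log2(codebook_size)


def frame_rate_hz(bandwidth_bps: float, num_quantizers: int, codebook_size: float) -> float:
    """Frame rate = BW / (Nq * n)."""
    return bandwidth_bps / (num_quantizers * bits_per_token(codebook_size))


def tokens_per_second(bandwidth_bps: float, num_quantizers: int, codebook_size: float) -> float:
    """token/s = Nq * frame rate."""
    return num_quantizers * frame_rate_hz(bandwidth_bps, num_quantizers, codebook_size)


def bandwidth_from_tokens(token_rate: float, codebook_size: float) -> float:
    """Inverse direction: BW = token/s * n (bits per second)."""
    return token_rate * bits_per_token(codebook_size)


# Encodec row: 6.0 kbps, Nq = 8, codebook size 1024 (n = 10)
print(frame_rate_hz(6000, 8, 1024))      # 75.0
print(tokens_per_second(6000, 8, 1024))  # 600.0

# WavTokenizer-small row: 75 token/s, codebook size 4096 (n = 12)
print(bandwidth_from_tokens(75, 4096))   # 900.0
```

The same check applied to the 40 token/s WavTokenizer-small row gives 480 bps, so the 0.5kbps figure in the table is a rounded value.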