✨ TL;DR
This note demonstrates that TurboQuant, a recent quantization method, is a suboptimal special case of the earlier EDEN/DRIVE quantization schemes. EDEN consistently outperforms TurboQuant across all tested scenarios, often by more than one bit of precision.
The note addresses a question of priority and technical relationship in the quantization literature. The recent TurboQuant work presents methods for quantizing high-dimensional vectors to low bit-widths, but it does not properly establish its relationship to earlier work: DRIVE (NeurIPS 2021) and EDEN (ICML 2022). This leaves the novelty and optimality of TurboQuant's approach unclear. The authors aim to show that TurboQuant is a restricted version of EDEN that makes suboptimal design choices, particularly in how it sets the scale parameter and how it combines quantization steps.
The authors provide a theoretical analysis comparing TurboQuant's two variants to EDEN. They show that TurboQuant_mse is EDEN with the scalar scale parameter fixed at S=1, which is generally suboptimal (although the optimal S for biased EDEN does approach 1 as the dimension grows). They show that TurboQuant_prod chains a biased (b-1)-bit EDEN step with an unbiased 1-bit quantization of the residual, and they identify three sources of suboptimality: (1) using S=1 in the first step, (2) using 1-bit QJL, which is inferior to 1-bit EDEN, for the residual, and (3) chaining a biased step and an unbiased step instead of applying unbiased b-bit quantization directly. The authors back these theoretical claims with experiments that reproduce every experimental setting from the TurboQuant paper.
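To make the role of the scale parameter S concrete, here is a minimal Python sketch of a 1-bit rotate-then-quantize scheme with three choices of S. It is an illustration under stated assumptions, not the authors' implementation: the function names (`random_rotation`, `one_bit_quantize`, `compare_scales`), the dense QR-based rotation, and the 1-bit centroid value sqrt(2/pi) are choices made for this sketch. The "biased" scale is the ordinary least-squares scale, and the "unbiased" scale is the one that preserves the inner product with the input exactly. Both are standard constructions used here as stand-ins and are not quoted from the note.

```python
import numpy as np


def random_rotation(d, rng):
    # Orthogonal matrix from the QR decomposition of a Gaussian matrix
    # (an assumption of this sketch; practical schemes often use faster
    # structured rotations).
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    return q * np.sign(np.diag(r))  # flip column signs; q stays orthogonal


def one_bit_quantize(x, rot):
    # Rotate, then map each coordinate to +/- one centroid. The centroid is
    # the 1-bit Lloyd-Max level for N(0, 1), rescaled to the norm of x.
    d = x.shape[0]
    r = rot @ x
    centroid = np.sqrt(2.0 / np.pi) * np.linalg.norm(x) / np.sqrt(d)
    return r, centroid * np.sign(r)


def compare_scales(d=256, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(d)
    rot = random_rotation(d, rng)
    r, q = one_bit_quantize(x, rot)

    s_fixed = 1.0                 # fixed scale, as in the S=1 special case
    s_biased = (r @ q) / (q @ q)  # least-squares scale: minimizes ||r - S q||^2
    s_unbiased = (x @ x) / (r @ q)  # makes <x, x_hat> equal ||x||^2 exactly

    def sq_err(scale):
        # Squared reconstruction error after rotating back.
        return float(np.sum((x - scale * (rot.T @ q)) ** 2))

    x_hat_unbiased = s_unbiased * (rot.T @ q)
    return {
        "s_biased": float(s_biased),
        "err_fixed": sq_err(s_fixed),
        "err_biased": sq_err(s_biased),
        "err_unbiased": sq_err(s_unbiased),
        "ip_unbiased": float(x @ x_hat_unbiased),
        "norm_sq": float(x @ x),
    }


if __name__ == "__main__":
    for name, value in compare_scales().items():
        print(f"{name}: {value:.6f}")
```

Because the least-squares scale minimizes the squared error over all scalars S, its error can never exceed that of S=1 for the same input and rotation. The size of the gap in this toy setup says nothing about the gaps reported in the note's experiments.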