How TurboQuant uses a random rotation to precompute its quantizer, and why skipping the training step changes the operational story.
Why a matrix of plus and minus ones does the work of a dense random rotation, in O(d log d) instead of O(d^2).
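
As a rough illustration of that cost claim, the sketch below assumes the "matrix of plus and minus ones" is a Sylvester-type Hadamard matrix combined with a random diagonal of sign flips, which is one common way to approximate a random rotation; the summary itself does not confirm the exact construction. The names `fwht`, `random_signs`, `rotate`, and `unrotate` are hypothetical helpers written for this example, not part of any TurboQuant API.

```python
import math
import random


def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform in O(d log d).

    Equivalent to multiplying by the d x d Sylvester Hadamard matrix,
    whose entries are all +1 or -1. The length must be a power of two.
    """
    x = list(values)
    d = len(x)
    if d == 0 or d & (d - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < d:
        # Butterfly step: combine entries that are h apart within each block of 2h.
        for start in range(0, d, 2 * h):
            for i in range(start, start + h):
                a, b = x[i], x[i + h]
                x[i], x[i + h] = a + b, a - b
        h *= 2
    return x


def random_signs(d, seed=0):
    """Random +1/-1 diagonal (assumed randomization step, seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(d)]


def rotate(x, signs):
    """Apply (1/sqrt(d)) * H * D to x, an orthogonal map that needs no dense matrix."""
    scale = 1.0 / math.sqrt(len(x))
    return [scale * v for v in fwht(s * xi for s, xi in zip(signs, x))]


def unrotate(y, signs):
    """Invert rotate: H is symmetric and H * H = d * I, so the inverse is D * H / sqrt(d)."""
    scale = 1.0 / math.sqrt(len(y))
    return [s * scale * v for s, v in zip(signs, fwht(y))]
```

A dense d x d rotation costs d^2 multiply-adds per vector, while the butterfly loop above performs d additions or subtractions in each of its log2(d) passes. A single sign-flip-plus-Hadamard pass is orthogonal but is not a uniformly random rotation, so treat it as an approximation of one.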