

	by dist-epoch 65 days ago
	For those wondering where this is practically relevant: this is the basic metric used to compare quantizations of LLMs, i.e. what is the KL divergence of a 4-bit quantization versus an 8-bit one versus the original 16-bit model.
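To make the comparison concrete, here is a minimal sketch in plain Python. It is a toy, not how any particular quantization toolkit measures this: the logits are made-up numbers, `fake_quantize` is a hypothetical uniform quantizer, and it rounds the logits directly, whereas a real evaluation would quantize the weights and average the KL divergence of the next-token distributions over a text corpus.

```python
import math

def softmax(logits):
    """Turn logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def fake_quantize(values, bits):
    """Hypothetical toy quantizer: snap each value to one of 2**bits evenly spaced levels."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((v - lo) / step) * step for v in values]

# Made-up next-token logits standing in for the 16-bit reference model.
reference_logits = [2.31, 0.47, -1.12, 3.05, 0.09, -0.58, 1.76, -2.40]
p_ref = softmax(reference_logits)

# Assumption: quantizing logits is only a stand-in for quantizing model weights.
kl_by_bits = {}
for bits in (8, 4):
    q = softmax(fake_quantize(reference_logits, bits))
    kl_by_bits[bits] = kl_divergence(p_ref, q)
    print(f"{bits}-bit vs reference: KL = {kl_by_bits[bits]:.6f} nats")
```

The reference distribution goes in the first argument, so the number reads as "how much is lost by using the quantized model in place of the original". A lower value means the quantized model's predictions stay closer to the original's.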

1 comment

abeppu 65 days ago

This is also how variational methods originally worked: given a model of known architecture, they pick the parameterization that best matches, in the KL sense, the distribution that generated the data, which is not otherwise compactly expressible.
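A rough sketch of that idea, under loud assumptions: the "model of known architecture" here is just a single Gaussian with parameters (mu, sigma), the target is a made-up two-bump distribution on a grid, and the fit is a brute-force grid search rather than the gradient-based optimization real variational methods use. The direction KL(q || p) is the one commonly used in variational inference; the comment above does not say which direction is meant.

```python
import math

def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_on_grid(grid, mu, sigma):
    """Discretized Gaussian: the parameterized model family q(mu, sigma)."""
    return normalize([math.exp(-0.5 * ((x - mu) / sigma) ** 2) for x in grid])

def kl_divergence(q, p):
    """KL(q || p) in nats for two discrete distributions on the same grid."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

grid = [i * 0.1 for i in range(-60, 61)]  # points from -6.0 to 6.0

# Hypothetical target: a lopsided two-bump shape that no single Gaussian matches exactly.
target = normalize([
    0.7 * math.exp(-0.5 * ((x - 1.0) / 0.8) ** 2)
    + 0.3 * math.exp(-0.5 * ((x + 1.5) / 1.2) ** 2)
    for x in grid
])

# Brute-force search over (mu, sigma) candidates for the smallest KL(q || target).
candidates = ((mu / 10, sigma / 10) for mu in range(-30, 31) for sigma in range(3, 31))
best_mu, best_sigma = min(
    candidates,
    key=lambda params: kl_divergence(gaussian_on_grid(grid, *params), target),
)
best_kl = kl_divergence(gaussian_on_grid(grid, best_mu, best_sigma), target)
print(f"best mu={best_mu}, sigma={best_sigma}, KL={best_kl:.4f} nats")
```

The leftover KL at the optimum is the price of forcing a compact parameterization onto a distribution that does not fit it exactly.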
