
As far as I know, there is no widely accepted explanation for context rot.

Numerical error in long sequences of query-key dot-products may be a key factor.





That should be easy to test: run a 16-bit model on various benchmarks, once with a fresh context and once with the context filled with irrelevant tokens, and record the relative performance degradation. Then do the same for a quantized version of the same model. If the quantized model shows a significantly larger relative drop from context rot, that would point to numerical error as a contributing factor.
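A minimal sketch of the comparison step, assuming you already have one benchmark score per condition (higher is better). The function names `relative_drop` and `compare_context_rot` and the result keys are made up for illustration; running the benchmarks and checking statistical significance across repeated runs are left out.

```python
def relative_drop(fresh_score: float, filled_score: float) -> float:
    """Fraction of fresh-context performance lost once the context is filled."""
    if fresh_score <= 0:
        raise ValueError("fresh_score must be positive")
    return (fresh_score - filled_score) / fresh_score


def compare_context_rot(
    fp16_fresh: float,
    fp16_filled: float,
    quant_fresh: float,
    quant_filled: float,
) -> dict:
    """Compare the relative drop of a 16-bit model and a quantized model.

    A positive 'excess_drop' means the quantized model lost a larger share
    of its own fresh-context score than the 16-bit model did.
    """
    fp16_drop = relative_drop(fp16_fresh, fp16_filled)
    quant_drop = relative_drop(quant_fresh, quant_filled)
    return {
        "fp16_drop": fp16_drop,
        "quant_drop": quant_drop,
        "excess_drop": quant_drop - fp16_drop,
    }
```

Each model's drop is normalized by its own fresh-context score, so a quantized model that is simply worse across the board does not register as extra context rot. A single pair of runs says little; the excess drop would need to hold up over several benchmarks and seeds before reading anything into it.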


