
The usual trick is to use domain-specific knowledge to translate that asymmetry to uniformity.

E.g.: Suppose the data has high-order structure but is locally uniform (very common; it comes about because of noise-inducing processes). Compute and store centroids. Those are more uniform than your underlying data, and since you don't have many of them it doesn't really matter anyway. Each vector is stored as a centroid index plus an offset from that centroid (SoA, not AoS). The indices are compressible with your favorite integer entropy-coding scheme (if you don't need to preserve order you can do better), and the offsets are now approximately uniform by assumption, so you can use your favorite sphere strategy from the literature.
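A rough sketch of the centroid-plus-offset split, in case it helps. Everything here is an assumption for illustration: the function names (`fit_centroids`, `encode`, `decode`) are made up, plain Lloyd's k-means stands in for whatever clustering you prefer, and the toy data is two noisy blobs. It stops before the interesting parts (entropy-coding the indices and quantizing the offsets), which depend on the schemes you pick.

```python
import random

def nearest(centroids, v):
    """Index of the centroid closest to v (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(centroids[i], v)))

def fit_centroids(vectors, k, iters=10, seed=0):
    """Plain Lloyd's k-means; a stand-in for any clustering method."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in vectors:
            buckets[nearest(centroids, v)].append(v)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

def encode(vectors, centroids):
    """Return two parallel arrays (SoA): centroid indices and offsets."""
    indices = [nearest(centroids, v) for v in vectors]
    offsets = [[a - b for a, b in zip(v, centroids[i])]
               for v, i in zip(vectors, indices)]
    return indices, offsets

def decode(indices, offsets, centroids):
    """Rebuild each vector as centroid + offset."""
    return [[c + o for c, o in zip(centroids[i], off)]
            for i, off in zip(indices, offsets)]

# Toy data: two blobs with uniform noise, i.e. high-order structure
# (which blob) plus locally uniform residuals.
rng = random.Random(1)
data = [[cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)]
        for cx, cy in [(0.0, 0.0), (100.0, 100.0)] for _ in range(50)]

centroids = fit_centroids(data, k=2)
indices, offsets = encode(data, centroids)
restored = decode(indices, offsets, centroids)
```

After the split, `indices` is a small-alphabet integer stream you can hand to an entropy coder, and `offsets` holds small, roughly uniform residuals instead of coordinates spanning the whole range, which is where a uniform quantizer would go.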



Intelligence is compression



