“Fill less than 1% of its space” becomes a very counterintuitive statement in any case when discussing high dimensions. If you consider a unit n-sphere bounded by a unit cube, the fraction occupied by the sphere vanishes for high n. (Aside: Strangely, the relationship is non monotonic and is actually maximal for n=6). For n=100 the volume of the unit 100-sphere is around 10^-40 (and you certainly cannot fit a second sphere in this cube…), so it's not surprising that the gains to be made in improving packing can be so large.
> (Aside: Strangely, the relationship is non monotonic and is actually maximal for n=6)
For this aside I crave a citation.
When n=1 the sphere's fit is 100%, since the cube and the sphere are both the same line segment in that dimension. Dismissing n=0 as degenerate (the fit is undefined there, I suppose: dividing by zero measure and all that), the first dimension should be maximal, with a steady decline thereafter, so the relationship is also monotonic.
This looks to have been a conflation by the GP between the volume of the unit sphere itself and its ratio to the volume of its bounding cube (which is not the unit cube: it has side 2). The volume of the sphere does top out at an unintuitive dimension, but the ratio of the two is indeed always decreasing. Intuitively, each additional dimension just adds more space between the corners of the cube and the surface of the sphere.
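If anyone wants to check the ratio claim numerically, here is a minimal sketch. It assumes the standard closed form for the n-ball volume, V_n(r) = pi^(n/2) * r^n / Gamma(n/2 + 1), and works in logs so n=100 doesn't underflow. The function names are my own, not anything from the thread.

```python
import math

def log_ball_volume(n, radius=1.0):
    """Natural log of the n-ball volume: pi^(n/2) * r^n / Gamma(n/2 + 1)."""
    return (0.5 * n * math.log(math.pi)
            + n * math.log(radius)
            - math.lgamma(0.5 * n + 1))

def ball_to_cube_ratio(n):
    """Fraction of the bounding cube (side 2r) filled by the inscribed n-ball.

    The radius cancels, so the unit ball inside a cube of side 2 is used.
    """
    return math.exp(log_ball_volume(n) - n * math.log(2.0))

# Ratio for n = 1..20: starts at 1 (ball and cube coincide) and only falls.
ratios = [ball_to_cube_ratio(n) for n in range(1, 21)]
for n, ratio in enumerate(ratios, start=1):
    print(f"n={n:2d}  ball/cube = {ratio:.6f}")

# Order of magnitude at n = 100, for the ball alone and for the ratio.
log10_volume_100 = log_ball_volume(100) / math.log(10)
log10_ratio_100 = log10_volume_100 - 100 * math.log10(2.0)
print(f"log10 V_100 = {log10_volume_100:.2f}, log10 ratio = {log10_ratio_100:.2f}")
```

The "around 10^-40" figure upthread is the volume of the radius-1 ball on its own; dividing by the 2^100 volume of its bounding cube makes the occupied fraction far smaller still.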
You don't need to involve the hypercube at all. You can just look at the volume of a hypersphere (n-ball). The dimension at which the n-ball's volume peaks depends on the radius, and for the unit n-ball the max is at 5D, not 6D. As D->inf, V->0.
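A quick way to see the radius dependence, as a sketch only: this assumes the usual formula V_n(r) = pi^(n/2) * r^n / Gamma(n/2 + 1) and simply scans integer dimensions from 1 up. `ball_volume` and `peak_dimension` are hypothetical helper names.

```python
import math

def ball_volume(n, radius=1.0):
    """Volume of the n-ball: pi^(n/2) * r^n / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) * radius ** n / math.gamma(n / 2 + 1)

def peak_dimension(radius=1.0, max_n=100):
    """Integer dimension in 1..max_n at which the n-ball volume is largest."""
    return max(range(1, max_n + 1), key=lambda n: ball_volume(n, radius))

for n in range(1, 11):
    print(f"n={n:2d}  V_n(1) = {ball_volume(n):.4f}")

print("peak for r=1:", peak_dimension(1.0))
print("peak for r=2:", peak_dimension(2.0))
print("V_100(1) =", ball_volume(100))
```

For r=1 the scan peaks at n=5 (V_5 = 8*pi^2/15, just above V_6 = pi^3/6). A larger radius pushes the peak to a higher dimension, but for any fixed radius the Gamma function in the denominator eventually wins and the volume heads to zero.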
This doesn't happen with the hypercube, btw. Really, it comes down to the definition of each object. The volume of the bounding hypercube (side 2, so 2^D) just continues to grow, so of course the ratio of cube to ball is going to explode...
As an extra fun tidbit, I'll add that when we work with statistics some extra wildness appears. For example, in high dimensions there is a huge difference between the geometry of the uniform distribution and the gaussian (normal) distribution, both of which can be thought of as spheres. Take any two points in each distribution, draw a line connecting them, and interpolate along that line. For the uniform distribution, everything will work as expected. But for the gaussian distribution you'll find that your interpolated points are not representative of the distribution! That's because the high-dimensional normal distribution is "hollow": almost all of the probability mass sits in a thin shell at radius about sqrt(n). In math speak, we say "the density lies along the shell." Instead, you have to interpolate along the geodesic, which is a fancy word for a line that is aware of the geometry (i.e. you're traveling on the surface). The easiest way to visualize this is to think about interpolating between two cities on Earth. If you draw a straight line you're gonna get a lot of dirt. If you interpolate along the surface instead, you're going to get much better results, even if that includes ocean, barren land, and... some cities and towns and other things. That's a lot more representative than what's underground.
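Here's a small sketch of the effect, with some assumptions flagged: I'm using a 1000-dimensional standard normal, and I'm using spherical linear interpolation (slerp) as a stand-in for "interpolating along the geodesic". That choice is mine and is only one reasonable reading of the comment above. Pure standard library, fixed seed.

```python
import math
import random

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def lerp(a, b, t):
    """Straight-line interpolation (cuts through the 'hollow' interior)."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

def slerp(a, b, t):
    """Spherical interpolation: follows the arc between a and b."""
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    if omega < 1e-12:  # nearly parallel vectors: the arc degenerates to a line
        return lerp(a, b, t)
    wa = math.sin((1 - t) * omega) / math.sin(omega)
    wb = math.sin(t * omega) / math.sin(omega)
    return [wa * x + wb * y for x, y in zip(a, b)]

rng = random.Random(0)  # fixed seed so the run is repeatable
dim = 1000              # assumed dimension, chosen only for illustration
a = [rng.gauss(0, 1) for _ in range(dim)]
b = [rng.gauss(0, 1) for _ in range(dim)]

typical_radius = math.sqrt(dim)  # samples concentrate near this norm
lerp_mid = norm(lerp(a, b, 0.5))
slerp_mid = norm(slerp(a, b, 0.5))

print(f"typical radius ~ {typical_radius:.1f}")
print(f"|a| = {norm(a):.1f}, |b| = {norm(b):.1f}")
print(f"|lerp midpoint|  = {lerp_mid:.1f}")
print(f"|slerp midpoint| = {slerp_mid:.1f}")
```

Two independent samples are nearly orthogonal in high dimensions, so the straight-line midpoint has norm close to sqrt(n/2), well inside the shell where essentially no samples live, while the slerp midpoint stays near sqrt(n).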
I’m familiar with this example of hyper-geometry. Put more abstractly, my intuition always said something like “the volume of high-dimensional shapes becomes more concentrated near their surface as the number of dimensions increases”.