I agree that my explanation is simplified, but not in a way that affects my point. Maybe I didn't explain it well, so I'll try again.
Imagine you're trying to draw a flat map of the curved surface of the earth. There are many possible ways you could do it, with differing tradeoffs between preserving sizes, preserving angles, and so on. But any faithful map should respect basic geometric constraints. If your map shows the Statue of Liberty as being in Manhattan, it's quite simply wrong.
Likewise, we can think about which distortions can possibly arise from a physical model of the eye. Regardless of how the curved surface of the retina is shaped, light travels in straight lines until it hits the pupil. That means the geometric relationships between distant and nearby points -- that is, which points "line up" with others -- are constrained to match those straight lines. FOVO does not obey these constraints (as mentioned in their explanation, and as demonstrated in their examples).
> 1. The light falls onto a curved surface of the retina, not a flat screen behind it, but more importantly
I didn't say anything about how the retina is shaped, because it doesn't matter. If a distant point A, a nearby point B, and the viewpoint V are all collinear (with B between V and A), then A can't be visible, because it's obscured by B. The exact details of where B appears in the resulting image would depend on the shape of the retina, true. But wherever B is, A must be mapped to the same point, which means A must be occluded unless the light rays are traveling in curved paths outside the eye.
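To make the argument concrete, here's a minimal sketch (my own illustration, with made-up coordinates): any projection the eye performs is a function of view direction alone, so two points on the same ray through the viewpoint necessarily share a view direction and land on the same retinal point, whatever shape the retina has.

```python
import numpy as np

def view_direction(v, p):
    """Unit vector from the viewpoint v toward the point p."""
    d = np.asarray(p, float) - np.asarray(v, float)
    return d / np.linalg.norm(d)

V = np.array([0.0, 0.0, 0.0])   # viewpoint
B = np.array([1.0, 2.0, 5.0])   # nearby point
A = 3 * B                       # distant point on the same ray through V

# Both points have the same view direction from V, so any projection
# that depends only on view direction maps them to the same image point:
print(np.allclose(view_direction(V, A), view_direction(V, B)))  # True
```

Since A and B map to the same image point, the only way to show A next to B (rather than behind it) is to bend the rays before they reach the eye.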
> 2. Our brains interpret the light that falls and creates the experience of a 3D space in front of us.
Agreed. But your brain can only perceive what is derived from some transformation of the image projected on the retina. If it creates the "experience" of seeing an object whose light is physically obstructed, then it's not a perception, it's a hallucination.
> If you actually simply flattened out the retina and mapped out, point to point, the light that falls, it would look a bit like their linear perspective example.
Not really; it depends on how you do the mapping. Their example shows what an image projected onto an extremely wide flat retina would look like. This is mathematically equivalent to a "gnomonic" map projection, which hugely distorts shapes and distances as you get farther from the center. You could use any other projection (defined as a function mapping between view directions and image coordinates) without breaking the geometric perspective relationships, and without needing to modify the 3D geometry of the scene at rendering time.
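A quick numerical illustration of that distortion claim (my own sketch, not FOVO's code): for a flat "retina" at focal length f, a point at angle theta off-axis lands at image radius f·tan(theta), which is the gnomonic projection. An equidistant ("fisheye") projection uses f·theta instead. Both are pure functions of view direction, so neither changes what occludes what; only the gnomonic one blows up as theta approaches 90 degrees.

```python
import numpy as np

f = 1.0  # focal length (arbitrary units)
for deg in (10, 45, 80, 89):
    theta = np.radians(deg)
    # gnomonic radius grows without bound; equidistant stays modest
    print(f"{deg:2d} deg  gnomonic={f * np.tan(theta):8.2f}  "
          f"equidistant={f * theta:5.2f}")
```

At 89 degrees the gnomonic radius is over 57 focal lengths, while the equidistant radius is only about 1.55 -- which is why an ultra-wide flat projection looks so stretched at the edges.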
You're right, of course. Our visual perception of the space can't cause things to become occluded when they weren't before, and vice-versa.
So maybe that occlusion change is just a side-effect of an implementation detail. In their zeal to explain that achieving a realistic-feeling FOV required changing how light bends through space, they may have harped too much on this side-effect as a way of describing what they meant.
I think the question is simply whether a better transform exists using just the 2D image. If one exists that feels like a not-too-distorted wide-angle view, like our natural vision, then people should be using it. If this turns out to be the best currently available way to achieve that, and it comes with a slight side-effect of changed occlusion, then it still seems like a good thing.