That's been my experience. However, when fallback to CPU happens, it sometimes ends up making a specific graph execution slower. But that's explicitly mentioned by the warning and pretty much expected.
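For anyone who wants to try this, here is a minimal sketch of how the CPU fallback is usually switched on. It assumes the framework in question is PyTorch (the thread doesn't say so explicitly) and that `PYTORCH_ENABLE_MPS_FALLBACK` is the relevant environment variable. `pick_device` is a hypothetical helper name, not something from a library.

```python
import os

# Assumption: PyTorch's MPS backend. As far as I know, this variable should be
# set before torch is imported for the CPU fallback to take effect.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


def pick_device() -> str:
    """Return "cuda", "mps", or "cpu", depending on what is available."""
    try:
        import torch
    except ImportError:
        # torch is not installed, so there is nothing to accelerate.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older torch builds have no torch.backends.mps, hence the getattr guard.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"Using device: {device}")
```

With the fallback enabled, unsupported operators run on the CPU instead of raising an error, which is where the extra host/device round trips and the slowdown mentioned above would come from.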
Yes, this is my experience. Many off-the-shelf models still don't work, but several of my own models work great as long as they don't use unsupported operators.
Yes, I am not sure to what extent MPS is a viable alternative to CUDA. You seem to write a lot about ML models. Do you have a detailed write-up on this subject?