Sure, for TensorFlow and PyTorch you can use shims. But there are countless other examples. I'm not saying it's impossible, but even you have to agree that the tooling is not as mature and extensive as with Nvidia?
AMD upstreams their changes: no need for shims for either library (a short sketch of what that means in practice follows the footnote). From my point of view as a dabbler in the arts, most implementations [1] of ML papers use PyTorch, TensorFlow, or other high-level libraries; I've yet to see one implemented in CUDA directly.
1. e.g. Hugging Face Transformers or Stable Diffusion
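
To make the "no shims" point concrete, a minimal sketch (assuming a ROCm build of PyTorch on a supported AMD GPU): ROCm builds keep the torch.cuda API surface, so device-agnostic code written against Nvidia runs unchanged on AMD.

    import torch

    # On a ROCm build of PyTorch, torch.cuda.is_available() also reports
    # True for a supported AMD GPU; the "cuda" device name maps to HIP.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    x = torch.randn(1024, 1024, device=device)
    y = x @ x  # matmul dispatches to cuBLAS on Nvidia, rocBLAS on AMD
    print(device, y.sum().item())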