This is purely speculative but based on my own experience.
I firmly believe that real-world performance is bottlenecked by cache long before the slight differences in out-of-order execution, pipelines, and speculation between Intel's and AMD's CPUs start to matter. The metrics I really care about are the time it takes to read a cache line and the cache miss penalty. Those two fundamental pieces of the system are the ultimate bottleneck that developers can't engineer their way around.
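A rough way to see why those two numbers dominate is the textbook average memory access time formula: hit time plus miss rate times miss penalty. The sketch below just evaluates that formula. The function name `amat` and every number in it are made up for illustration, not measurements of any real Intel or AMD part.

```python
def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hit time plus the expected cost of a miss."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time_ns + miss_rate * miss_penalty_ns


# Hypothetical figures, chosen only to show the shape of the trade-off.
cache_friendly = amat(hit_time_ns=1.0, miss_rate=0.05, miss_penalty_ns=80.0)
cache_hostile = amat(hit_time_ns=1.0, miss_rate=0.20, miss_penalty_ns=80.0)

print(f"cache-friendly code: {cache_friendly:.1f} ns per access")
print(f"cache-hostile code:  {cache_hostile:.1f} ns per access")
```

With these assumed inputs the miss term swamps the hit term, which is the sense in which the miss penalty matters more than small differences in the core itself.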
It's really easy to construct cases where one CPU beats another in either single- or multithreaded performance. The hard thing to judge is how it looks in real-world software, and the more cumulative a benchmark is, the more it tells you about the quality of the software you're benchmarking rather than the CPU. That's particularly true because optimizing for cache is so difficult. It's trivial to hit the memory wall and decide that's it, but it's really hard to decide that your entire architecture was flawed, or that your choice of language, algorithm, or data structure was wrong, and then A/B two different implementations and pick the one that's slightly faster. The ROI isn't there.
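To make the data-structure point a bit more concrete, here is a back-of-the-envelope sketch that counts how many distinct cache lines get touched when you read one 4-byte field from each of 1000 records. It assumes 64-byte cache lines, records starting at address 0, and the field sitting at offset 0. The helper `lines_touched` and the record sizes are hypothetical, and it models only addresses, not a real cache.

```python
CACHE_LINE_BYTES = 64  # assumption: a common line size on current x86 CPUs


def lines_touched(n_records, stride_bytes, field_bytes, line_bytes=CACHE_LINE_BYTES):
    """Count distinct cache lines read when loading one field from each record.

    stride_bytes is the distance between consecutive records in memory;
    the field is assumed to sit at offset 0 of each record.
    """
    touched = set()
    for i in range(n_records):
        start = i * stride_bytes
        end = start + field_bytes - 1
        # A field may straddle a line boundary, so add every line it covers.
        for line in range(start // line_bytes, end // line_bytes + 1):
            touched.add(line)
    return len(touched)


# Array of 64-byte structs: every field read lands on its own cache line.
array_of_structs = lines_touched(1000, stride_bytes=64, field_bytes=4)

# The same field packed into its own contiguous array: 16 values share a line.
packed_field_array = lines_touched(1000, stride_bytes=4, field_bytes=4)

print(array_of_structs, packed_field_array)
```

Same algorithm, same amount of useful data, and under these assumptions the layout alone changes the number of lines pulled through the cache by more than an order of magnitude. That's the kind of decision that is cheap at design time and expensive to revisit later.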
There's a philosophical point here, though, about the nature of "better" as it applies to the real world, and it makes my belief moot: even if I'm right, when X% of the software people use runs Y% faster on a particular architecture, it doesn't matter if that's because developers wrote and benchmarked their code on that architecture. It's still "better."
All that said, I'm very happy with my Zen2 purchase.
Real-world performance includes software that can avoid cache misses (most games) and code that can’t (most compilers). It is good to make caches bigger and faster, especially for less carefully engineered software, which tends to have many layers of indirection, a JIT, and a garbage collector all putting pressure on the caches. Smaller transistors help us get bigger L3 caches, which is something that helps Zen2 compile code pretty fast! Zen3 makes that L3 unified, so it will be effectively bigger still.
Making L1 and L2 caches bigger is problematic, as that also makes them slower.