Looking at LMArena, which Polymarket uses, I'm not surprised. Based on the little data there is (3k duels), GPT5 is possibly worse than Gemini: it lost more direct duels to Gemini 2.5 Pro than it won. Not sure why its Elo is still higher; possibly GPT5 won more decisively against bad models, which I don't care about.
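
To illustrate how that can happen, here is a rough sketch with made-up numbers (not real LMArena data). It assumes the leaderboard rating behaves like a Bradley-Terry fit mapped onto an Elo-like scale; the models `A`, `B`, `C`, the win counts, and the helper names are all hypothetical. `A` loses its direct duel against `B` but beats the weak model `C` far more decisively, and ends up rated higher overall.

```python
import math

# Hypothetical win counts, NOT real leaderboard data: wins[(x, y)] = times x beat y.
wins = {
    ("A", "B"): 45, ("B", "A"): 55,  # A loses the direct duel to B
    ("A", "C"): 95, ("C", "A"): 5,   # A crushes the weak model C
    ("B", "C"): 70, ("C", "B"): 30,  # B beats C less decisively
}
models = ["A", "B", "C"]


def bradley_terry(wins, models, iters=200):
    """Fit Bradley-Terry strengths with the standard MM iteration."""
    p = {m: 1.0 for m in models}
    for _ in range(iters):
        new_p = {}
        for i in models:
            others = [j for j in models if j != i]
            total_wins = sum(wins.get((i, j), 0) for j in others)
            # Games played against j, weighted by the current strength estimates.
            denom = sum(
                (wins.get((i, j), 0) + wins.get((j, i), 0)) / (p[i] + p[j])
                for j in others
            )
            new_p[i] = total_wins / denom
        # Renormalize so the strengths stay on a fixed scale.
        norm = sum(new_p.values())
        p = {m: v * len(models) / norm for m, v in new_p.items()}
    return p


def to_elo_scale(p, base=1000.0):
    """Map strengths to an Elo-like scale (400 points per factor of 10)."""
    return {m: base + 400.0 * math.log10(v) for m, v in p.items()}


ratings = to_elo_scale(bradley_terry(wins, models))
for name in models:
    print(name, round(ratings[name]))
```

With these counts `A` comes out above `B` even though `B` won 55 of their 100 direct duels, because the fit rewards `A`'s lopsided record against `C`. Whether that is what actually happened with GPT5 and Gemini 2.5 Pro would need the real pairwise counts.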