GPT-4o was below the level of medical professionals on medical reasoning benchmarks; GPT-5 (apparently Thinking medium) now far exceeds them. (Usual benchmark caveats apply.)
elvis, Aug 12, 20:58
GPT-5 on Multimodal Medical Reasoning
On MedXpertQA MM, GPT-5 improves reasoning and understanding scores by +29.62% and +36.18% over GPT-4o. It surpasses pre-licensed human experts by +24.23% in reasoning and +29.40% in understanding.