New secret math benchmark stumps AI models and PhDs alike
On Friday, research organization Epoch AI released FrontierMath, a new mathematics benchmark that has been turning heads in the AI world because it contains hundreds of expert-level problems that leading AI models solve less than 2 percent of the time, according to Epoch AI. The benchmark tests AI language models (such as GPT-4o, which powers…