On the Reasoning Capacity of AI Models and How to Quantify It
Recent advances in Large Language Models (LLMs) have intensified the debate over the fundamental nature of their reasoning capabilities. Although these models achieve high performance on benchmarks such as GPQA and MMLU, they exhibit limitations on more complex reasoning tasks, which highlights the need for more rigorous evaluation methodologies. We propose a …