Decoding Stumpers: Large Language Models vs. Human Problem-Solvers

Type: Article

Publication Date: 2023-01-01

Citations: 0

DOI: https://doi.org/10.18653/v1/2023.findings-emnlp.779

Locations

  • arXiv (Cornell University)

Similar Works

  • Decoding Stumpers: Large Language Models vs. Human Problem-Solvers (2023). Alon Goldstein, Miriam Havin, Roi Reichart, Ariel Goldstein
  • Competition-Level Problems are Effective LLM Evaluators (2023). Yiming Huang, Zhenghao Lin, X Liu, Yeyun Gong, Shuai Lu, Fangyu Lei, Yaobo Liang, Yelong Shen, Chen Lin, Nan Duan
  • Enhancing Mathematical Reasoning in LLMs by Stepwise Correction (2024). Zhenyu Wu, Qingkai Zeng, Zhihan Zhang, Zhaoxuan Tan, Chao Shen, Meng Jiang
  • Can Language Models Solve Olympiad Programming? (2024). Quan Shi, Michael T. Tang, Karthik Narasimhan, Shunyu Yao
  • SAAS: Solving Ability Amplification Strategy for Enhanced Mathematical Reasoning in Large Language Models (2024). Hyeonwoo Kim, Gyoungjin Gim, Yungi Kim, Jihoo Kim, Byungju Kim, Wonseok Lee, Chanjun Park
  • Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models (2024). Sijia Chen, Baochun Li, Di Niu
  • Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning (2024). Joykirat Singh, Akshay Nambi, Vibhav Vineet
  • PECC: Problem Extraction and Coding Challenges (2024). Patrick Haller, Jonas Golde, Alan Akbik
  • ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection (2024). Yibo Yan, Shen Wang, Jiahao Huo, Hang Li, Boyan Li, Jiamin Su, Xiong Gao, Yifan Zhang, Tianlong Xu, Zhendong Chu
  • TypedThinker: Typed Thinking Improves Large Language Model Reasoning (2024). Danqing Wang, Jianxin Ma, Fei Fang, Lei Li
  • No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function (2023). Haotian Xu
  • Evaluating the Deductive Competence of Large Language Models (2023). S. M. Seals, Valerie L. Shalin
  • INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models (2023). Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
  • Low-Cost Language Models: Survey and Performance Evaluation on Python Code Generation (2024). Jéssica López Espejel, Mahaman Sanoussi Yahaya Alassan, Mérième Bouhandi, Walid Dahhane, El Hassane Ettifouri
  • Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning (2024). Ruosen Li, Ziming Luo, Xinya Du
  • A Survey on Large Language Models with some Insights on their Capabilities and Limitations (2025). Andrea Matarazzo, Riccardo Torlone
  • Puzzle Solving using Reasoning of Large Language Models: A Survey (2024). Panagiotis Giadikiaroglou, Maria Lymperaiou, Giorgos Filandrianos, Giorgos Stamou
  • Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction (2024). Xiaoyuan Li, Wenjie Wang, Moxin Li, Junrong Guo, Yang Zhang, Fuli Feng
  • MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains (2024). Guoli Yin, Haoping Bai, Shuang Ma, Nan Feng, Yanchao Sun, Zhaoyang Xu, Shen Ma, Jiarui Lu, Xiang Kong, Aonan Zhang
  • Auto-Evolve: Enhancing Large Language Model's Performance via Self-Reasoning Framework (2024). Krishna Aswani, H. J. Lu, Prachi Patankar, Priya Dhalwani, I Leng Tan, Jayant Ganeshmohan, Simon Lacasse

Works That Cite This (0)
