Measuring Progress on Scalable Oversight for Large Language Models

Samuel R. Bowman, Jeeyoon Hyun, Ethan Perez, Edwin Chen, Craig Pettit, Scott Heiner, Kamilė Lukošiūtė, Amanda Askell, Andy Jones, Anna Chen

Type: Preprint

Publication Date: 2022-11

Citations: 14

DOI: https://doi.org/10.48550/arxiv.2211.03540

Locations

arXiv (Cornell University)
DataCite API

Similar Works

Title	Year	Authors
Leveraging LLMs for Dialogue Quality Measurement	2024	Jinghan Jia Abi Komma Timothy Leffel Xujun Peng Ajay Nagesh Tamer Soliman Aram Galstyan Anoop Kumar
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena	2023	Lianmin Zheng Wei-Lin Chiang Ying Sheng Siyuan Zhuang Zhanghao Wu Yonghao Zhuang Lin Zi Zhuohan Li Dacheng Li Eric P. Xing
GRILLBot In Practice: Lessons and Tradeoffs Deploying Large Language Models for Adaptable Conversational Task Assistants	2024	Sophie Fischer Carlos Gemmell Niklas Tecklenburg Iain Mackie Federico Rossetto Jeff Dalton
Large Language Models as Misleading Assistants in Conversation	2024	Bingzhe Hou Kejian Shi Jason Phang James Aung Steven Adler Rosie Campbell
Comparing Rationality Between Large Language Models and Humans: Insights and Open Questions	2024	Dana Alsagheer Rabimba Karanjai Nour Diallo Weidong Shi Lu Yang Suha Beydoun Qiaoning Zhang
Regulation of Language Models With Interpretability Will Likely Result In A Performance Trade-Off	2024	Eoin M. Kenny Julie Shah
Large Language Models Must Be Taught to Know What They Don't Know	2024	Sanyam Kapoor Nate Gruver Manley Roberts Katherine M. Collins Arka Pal Umang Bhatt Adrian Weller Samuel Dooley Micah Goldblum Andrew Gordon Wilson
AI Revolution on Chat Bot: Evidence from a Randomized Controlled Experiment	2024	Sida Peng Wojciech Swiatek Allen Gao Paul Cullivan Haoge Chang
Prompted LLMs as Chatbot Modules for Long Open-domain Conversation	2023	Gibbeum Lee Volker Hartmann Jong-Ho Park Dimitris Papailiopoulos Kangwook Lee
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond	2024	Jingfeng Yang Hongye Jin Ruixiang Tang Xiaotian Han Qizhang Feng Haoming Jiang Shaochen Zhong Bing Yin Xia Hu
ExpertPrompting: Instructing Large Language Models to be Distinguished Experts	2023	Benfeng Xu Yang An Junyang Lin Quan Wang Chang Zhou Yongdong Zhang Zhendong Mao
FB-Bench: A Fine-Grained Multi-Task Benchmark for Evaluating LLMs' Responsiveness to Human Feedback	2024	Yan Li Miao Zheng Fan Yang Guosheng Dong Bin Cui Weipeng Chen Zenan Zhou Wentao Zhang
Can we trust the evaluation on ChatGPT?	2023	Rachith Aiyappa Jisun An Haewoon Kwak Yong‐Yeol Ahn
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations	2023	Ning Ding Yulin Chen Bokai Xu Yujia Qin Shengding Hu Zhiyuan Liu Maosong Sun Bowen Zhou
Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback	2023	Baolin Peng Michel Galley Pengcheng He Hao Cheng Yujia Xie Yu Hu Qiuyuan Huang Lars Lidén Yu Zhou Weizhu Chen
Towards Reliable and Fluent Large Language Models: Incorporating Feedback Learning Loops in QA Systems	2023	Dongyub Lee Taesun Whang Chanhee Lee Heuiseok Lim
HumanRankEval: Automatic Evaluation of LMs as Conversational Assistants	2024	Milan Gritta Γεράσιμος Λάμπουρας Ignacio Iacobacci
Evaluating Human-Language Model Interaction	2022	Mina Lee Megha Srivastava Amelia Hardy John Thickstun Esin Durmus Ashwin Paranjape Ines Gerard-Ursin Xiang Lisa Li Faisal Ladhak Frieda Rong

Works That Cite This (5)

Title	Year	Authors
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values	2023	Hannah Rose Kirk Andrew Bean Bertie Vidgen Paul Röttger Scott A. Hale
Taking Advice from ChatGPT	2023	Peter Zhang
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation	2023	Patrick Fernandes Aman Madaan Emmy Liu António Farinhas Pedro Henrique Martins Amanda Bertsch José G. C. de Souza Shuyan Zhou Tongshuang Wu Graham Neubig
Integrating measures of replicability into literature search: Challenges and opportunities	2023	Chuhao Wu Tatiana Chakravorti John D. Carroll Sarah Rajtmajer
Evaluating Superhuman Models with Consistency Checks	2024	Lukas Fluri Daniel Paleka Florian Tramèr
