Delphi: Towards Machine Ethics and Norms

Type: Preprint

Publication Date: 2021-10-14

Citations: 66

Locations

  • arXiv (Cornell University)

Similar Works

Action Title Year Authors
+ Can Machines Learn Morality? The Delphi Experiment 2021 Liwei Jiang
Jena D. Hwang
Chandra Bhagavatula
Ronan Le Bras
Jenny Liang
Jesse Dodge
Keisuke Sakaguchi
Maxwell Forbes
Jon Borchardt
Saadia Gabriel
+ Aligning AI With Shared Human Values 2020 Dan Hendrycks
Collin Burns
Steven Basart
Andrew Critch
Jerry Li
Dawn Song
Jacob Steinhardt
+ ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models 2024 Yanyun Sun
Gao Wei
Jing Ma
Hongzhan Lin
Ziyang Luo
Wenxuan Zhang
+ A Word on Machine Ethics: A Response to Jiang et al. (2021) 2021 Zeerak Talat
Hagen Blix
Josef Valvoda
Maya Indira Ganesh
Ryan Cotterell
Adina Williams
+ SCRUPLES: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes 2020 Nicholas Lourie
Ronan Le Bras
Yejin Choi
+ Does Moral Code Have a Moral Code? Probing Delphi’s Moral Philosophy 2022 Kathleen Fraser
Svetlana Kiritchenko
Esma Balkır
+ SCRUPLES: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes 2021 Nicholas Lourie
Ronan Le Bras
Yejin Choi
+ Informed AI Regulation: Comparing the Ethical Frameworks of Leading LLM Chatbots Using an Ethics-Based Audit to Assess Moral Reasoning and Normative Values 2024 Jon Chun
Katherine Elkins
+ Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? 2023 Jingyan Zhou
Minda Hu
Junan Li
Xiaoying Zhang
Xixin Wu
Irwin King
Helen Meng
+ BERT has a Moral Compass: Improvements of ethical and moral values of machines 2019 Patrick Schramowski
Cigdem Turan
Sophie Jentzsch
Constantin A. Rothkopf
Kristian Kersting
+ Is ETHICS about ethics? Evaluating the ETHICS benchmark 2024 Leif Hancox-Li
Borhane Blili-Hamelin
+ Exploring and steering the moral compass of Large Language Models 2024 Alejandro Tlaie
+ Some Issues in Predictive Ethics Modeling: An Annotated Contrast Set of "Moral Stories" 2024 Ben Fitzgerald
+ MoralBench: Moral Evaluation of LLMs 2024 Jianchao Ji
Yutong Chen
Mingyu Jin
Wujiang Xu
Wenyue Hua
Yongfeng Zhang
+ STREAM: Social data and knowledge collective intelligence platform for TRaining Ethical AI Models 2023 Yuwei Wang
Enmeng Lu
Zizhe Ruan
Yao Liang
Yi Zeng
+ Ethical Reasoning and Moral Value Alignment of LLMs Depend on the Language we Prompt them in 2024 Utkarsh Agarwal
Kumar Tanmay
Aditi Khandelwal
Monojit Choudhury
+ The Moral Turing Test: Evaluating Human-LLM Alignment in Moral Decision-Making 2024 Basile Garcia
Crystal Qian
Stefano Palminteri

Works That Cite This (37)

Action Title Year Authors
+ A General Language Assistant as a Laboratory for Alignment 2021 Amanda Askell
Yuntao Bai
Anna Chen
Dawn Drain
Deep Ganguli
Tom Henighan
Andy Jones
Nicholas Joseph
Ben Mann
Nova DasSarma
+ Speaking Multiple Languages Affects the Moral Bias of Language Models 2023 Katharina Haemmerl
Bjoern Deiseroth
Patrick Schramowski
Jindřich Libovický
Constantin A. Rothkopf
Alexander Fraser
Kristian Kersting
+ Mapping Topics in 100,000 Real-Life Moral Dilemmas 2022 Tuan Dung Nguyen
Georgiana Lyall
Alasdair Tran
Minjeong Shin
Nicholas George Carroll
Colin Klein
Lexing Xie
+ MoralDial: A Framework to Train and Evaluate Moral Dialogue Systems via Moral Discussions 2023 Hao Sun
Zhexin Zhang
Fei Mi
Yasheng Wang
Wei Liu
Jianwei Cui
Bin Wang
Qun Liu
Minlie Huang
+ You are what you're for: Essentialist categorization in large language models 2023 Siying Zhang
Selena She
Tobias Gerstenberg
David Rosé
+ The Blame Game: Understanding Blame Assignment in Social Media 2023 Rui-Jie Xi
Munindar P. Singh
+ I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation 2023 Chandra Bhagavatula
Jena D. Hwang
Doug Downey
Ronan Le Bras
Ximing Lu
Lianhui Qin
Keisuke Sakaguchi
Swabha Swayamdipta
Peter West
Yejin Choi
+ SafeText: A Benchmark for Exploring Physical Safety in Language Models 2022 Sharon Levy
Emily Allaway
Melanie Subbiah
Lydia B. Chilton
Desmond Upton Patton
Kathleen McKeown
William Yang Wang
+ Knowledge of cultural moral norms in large language models 2023 Aida Ramezani
Yang Xu
+ ProsocialDialog: A Prosocial Backbone for Conversational Agents 2022 Hyunwoo Kim
Youngjae Yu
Liwei Jiang
Ximing Lu
Daniel Khashabi
Gunhee Kim
Yejin Choi
Maarten Sap

Works Cited by This (19)

Action Title Year Authors
+ HellaSwag: Can a Machine Really Finish Your Sentence? 2019 Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
+ Social IQa: Commonsense Reasoning about Social Interactions 2019 Maarten Sap
Hannah Rashkin
Derek Chen
Ronan Le Bras
Yejin Choi
+ Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning 2019 Lifu Huang
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
+ WinoGrande: An Adversarial Winograd Schema Challenge at Scale 2020 Keisuke Sakaguchi
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
+ PIQA: Reasoning about Physical Commonsense in Natural Language 2020 Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
+ Social Bias Frames: Reasoning about Social and Power Implications of Language 2020 Maarten Sap
Saadia Gabriel
Lianhui Qin
Dan Jurafsky
Noah A. Smith
Yejin Choi
+ Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? 2020 Kobi Leins
Jey Han Lau
Timothy Baldwin
+ Language (Technology) is Power: A Critical Survey of “Bias” in NLP 2020 Su Lin Blodgett
Solon Barocas
Hal Daumé
Hanna Wallach
+ Model Cards for Model Reporting 2019 Margaret Mitchell
Simone Wu
Andrew Zaldivar
Parker Barnes
Lucy Vasserman
Ben Hutchinson
Elena Spitzer
Inioluwa Deborah Raji
Timnit Gebru
+ Social Chemistry 101: Learning to Reason about Social and Moral Norms 2020 Maxwell Forbes
Jena D. Hwang
Vered Shwartz
Maarten Sap
Yejin Choi