Delphi: Towards Machine Ethics and Norms
Liwei Jiang, Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Maxwell Forbes, Jon Borchardt, Jenny Liang, Oren Etzioni, Maarten Sap, Yejin Choi
Type: Preprint
Publication Date: 2021-10-14
Citations: 66
Locations
arXiv (Cornell University)
Similar Works
Title | Year | Authors
Can Machines Learn Morality? The Delphi Experiment | 2021 | Liwei Jiang, Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jenny Liang, Jesse Dodge, Keisuke Sakaguchi, Maxwell Forbes, Jon Borchardt, Saadia Gabriel
Aligning AI With Shared Human Values | 2020 | Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Jerry Li, Dawn Song, Jacob Steinhardt
ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models | 2024 | Yanyun Sun, Gao Wei, Jing Ma, Hongzhan Lin, Ziyang Luo, Wenxuan Zhang
A Word on Machine Ethics: A Response to Jiang et al. (2021) | 2021 | Zeerak Talat, Hagen Blix, Josef Valvoda, Maya Indira Ganesh, Ryan Cotterell, Adina Williams
SCRUPLES: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes | 2020 | Nicholas Lourie, Ronan Le Bras, Yejin Choi
Does Moral Code have a Moral Code? Probing Delphi’s Moral Philosophy | 2022 | Kathleen Fraser, Svetlana Kiritchenko, Esma Balkır
Scruples: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes | 2020 | Nicholas Lourie, Ronan Le Bras, Yejin Choi
SCRUPLES: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes | 2021 | Nicholas Lourie, Ronan Le Bras, Yejin Choi
Does Moral Code Have a Moral Code? Probing Delphi's Moral Philosophy | 2022 | Kathleen Fraser, Svetlana Kiritchenko, Esma Balkır
Informed AI Regulation: Comparing the Ethical Frameworks of Leading LLM Chatbots Using an Ethics-Based Audit to Assess Moral Reasoning and Normative Values | 2024 | Jon Chun, Katherine Elkins
Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? | 2023 | Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, Helen Meng
BERT has a Moral Compass: Improvements of ethical and moral values of machines | 2019 | Patrick Schramowski, Cigdem Turan, Sophie Jentzsch, Constantin A. Rothkopf, Kristian Kersting
Is ETHICS about ethics? Evaluating the ETHICS benchmark | 2024 | Leif Hancox-Li, Borhane Blili-Hamelin
Exploring and steering the moral compass of Large Language Models | 2024 | Alejandro Tlaie
Some Issues in Predictive Ethics Modeling: An Annotated Contrast Set of "Moral Stories" | 2024 | Ben Fitzgerald
MoralBench: Moral Evaluation of LLMs | 2024 | Jianchao Ji, Yutong Chen, Mingyu Jin, Wujiang Xu, Wenyue Hua, Yongfeng Zhang
STREAM: Social data and knowledge collective intelligence platform for TRaining Ethical AI Models | 2023 | Yuwei Wang, Enmeng Lu, Zizhe Ruan, Yao Liang, Yi Zeng
Ethical Reasoning and Moral Value Alignment of LLMs Depend on the Language we Prompt them in | 2024 | Utkarsh Agarwal, Kumar Tanmay, Aditi Khandelwal, Monojit Choudhury
The Moral Turing Test: Evaluating Human-LLM Alignment in Moral Decision-Making | 2024 | Basile Garcia, Crystal Qian, Stefano Palminteri
Works That Cite This (37)
Title | Year | Authors
A General Language Assistant as a Laboratory for Alignment | 2021 | Amanda Askell, Yuntao Bai, Anna Chen, Dawn Drain, Deep Ganguli, Tom Henighan, Andy Jones, Nicholas Joseph, Ben Mann, Nova DasSarma
Speaking Multiple Languages Affects the Moral Bias of Language Models | 2023 | Katharina Haemmerl, Bjoern Deiseroth, Patrick Schramowski, Jindřich Libovický, Constantin A. Rothkopf, Alexander Fraser, Kristian Kersting
Mapping Topics in 100,000 Real-Life Moral Dilemmas | 2022 | Tuan Dung Nguyen, Georgiana Lyall, Alasdair Tran, Minjeong Shin, Nicholas George Carroll, Colin Klein, Lexing Xie
MoralDial: A Framework to Train and Evaluate Moral Dialogue Systems via Moral Discussions | 2023 | Hao Sun, Zhexin Zhang, Fei Mi, Yasheng Wang, Wei Liu, Jianwei Cui, Bin Wang, Qun Liu, Minlie Huang
You are what you're for: Essentialist categorization in large language models | 2023 | Siying Zhang, Selena She, Tobias Gerstenberg, David Rosé
The Blame Game: Understanding Blame Assignment in Social Media | 2023 | Rui-Jie Xi, Munindar P. Singh
I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation | 2023 | Chandra Bhagavatula, Jena D. Hwang, Doug Downey, Ronan Le Bras, Ximing Lu, Lianhui Qin, Keisuke Sakaguchi, Swabha Swayamdipta, Peter West, Yejin Choi
SafeText: A Benchmark for Exploring Physical Safety in Language Models | 2022 | Sharon Levy, Emily Allaway, Melanie Subbiah, Lydia B. Chilton, Desmond Upton Patton, Kathleen McKeown, William Yang Wang
Knowledge of cultural moral norms in large language models | 2023 | Aida Ramezani, Yang Xu
ProsocialDialog: A Prosocial Backbone for Conversational Agents | 2022 | Hyunwoo Kim, Youngjae Yu, Liwei Jiang, Ximing Lu, Daniel Khashabi, Gunhee Kim, Yejin Choi, Maarten Sap
Works Cited by This (19)
Title | Year | Authors
HellaSwag: Can a Machine Really Finish Your Sentence? | 2019 | Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, Yejin Choi
Social IQa: Commonsense Reasoning about Social Interactions | 2019 | Maarten Sap, Hannah Rashkin, Derek Chen, Ronan Le Bras, Yejin Choi
Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning | 2019 | Lifu Huang, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
WinoGrande: An Adversarial Winograd Schema Challenge at Scale | 2020 | Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
PIQA: Reasoning about Physical Commonsense in Natural Language | 2020 | Yonatan Bisk, Rowan Zellers, Ronan Le Bras, Jianfeng Gao, Yejin Choi
Social Bias Frames: Reasoning about Social and Power Implications of Language | 2020 | Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, Yejin Choi
Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? | 2020 | Kobi Leins, Jey Han Lau, Timothy Baldwin
Language (Technology) is Power: A Critical Survey of “Bias” in NLP | 2020 | Su Lin Blodgett, Solon Barocas, Hal Daumé, Hanna Wallach
Model Cards for Model Reporting | 2019 | Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, Timnit Gebru
Social Chemistry 101: Learning to Reason about Social and Moral Norms | 2020 | Maxwell Forbes, Jena D. Hwang, Vered Shwartz, Maarten Sap, Yejin Choi