1/28/2024

Moral integrity

Content Warning: some examples in this paper may be offensive or upsetting.

⋆Work done at Meta AI Research

1 Introduction

Conversational agents have come increasingly closer to human competence in open-domain dialogue settings; however, such models can reflect insensitive, hurtful, or entirely incoherent viewpoints that erode a user's trust in the moral integrity of the system. Moral deviations are difficult to mitigate because moral judgments are not universal, and there may be multiple competing judgments that apply to a situation simultaneously. In this work, we introduce a new resource, not to authoritatively resolve moral ambiguities, but instead to facilitate a systematic understanding of the intuitions, values, and moral judgments reflected in the utterances of dialogue systems. MIC is such a resource, which captures the moral assumptions of 38k prompt-reply pairs, using 99k distinct Rules of Thumb (RoTs). Each RoT reflects a particular moral conviction that can explain why a chatbot's reply may appear acceptable or problematic. We further organize RoTs with a set of 9 moral and social attributes and benchmark performance for attribute classification. Most importantly, we show that current neural language models can automatically generate new RoTs that reasonably describe previously unseen interactions, but they still struggle with certain scenarios. Our findings suggest that MIC will be a useful resource for understanding language models' implicit moral assumptions and flexibly benchmarking the integrity of conversational agents. To download the data, see †.

Figure 1: A representative MIC annotation. We evaluate the AI response (Reply) to a human query (Prompt) using Rules of Thumb (RoTs), which describe "right and wrong" ways to handle the conversation.
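The structure described above (prompt-reply pairs annotated with RoTs, each RoT carrying moral and social attributes) can be sketched in code. This is a minimal illustrative sketch: the class and field names below are assumptions for exposition, not the schema of the actual MIC release.

```python
# Hypothetical sketch of one MIC-style record: a prompt-reply pair
# annotated with Rules of Thumb (RoTs). All names here are
# illustrative assumptions, not the released MIC data format.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RoT:
    text: str  # the rule of thumb, e.g. "It is wrong to ..."
    # moral/social attribute labels attached to this RoT
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class MICExample:
    prompt: str      # the human query
    reply: str       # the chatbot response being evaluated
    rots: List[RoT]  # RoTs explaining why the reply is (un)acceptable


example = MICExample(
    prompt="Should I read my partner's private messages?",
    reply="Sure, go ahead if you're curious.",
    rots=[
        RoT(
            text="It is wrong to violate a partner's privacy.",
            attributes={"moral_foundation": "loyalty-betrayal"},
        )
    ],
)

# Under this sketch, attribute classification maps
# (prompt, reply, rot.text) -> rot.attributes, and RoT generation
# maps (prompt, reply) -> rot.text for unseen interactions.
print(example.rots[0].attributes["moral_foundation"])
```

A corpus is then simply a list of such records, which makes the two benchmark tasks in the paper easy to phrase as input-output mappings over the fields.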