Understanding Multi-Turn Toxic Behaviors in Open-Domain Chatbots (arXiv:2307.09579v1 [cs.CR])

July 20, 2023, 1:10 a.m. | Bocheng Chen, Guangjing Wang, Hanqing Guo, Yuanda Wang, Qiben Yan

Source: cs.CR updates on arXiv.org

Recent advances in natural language processing and machine learning have led
to the development of chatbot models, such as ChatGPT, that can engage in
conversational dialogue with human users. However, the ability of these models
to generate toxic or harmful responses during a non-toxic multi-turn
conversation remains an open research question. Existing research focuses on
single-turn sentence testing, while we find that 82% of the individual
non-toxic sentences that elicit toxic behaviors in a conversation are
considered safe by existing …
