Nov 16, 2023

Summon a demon and bind it

I participated in this survey on red-teaming AIs: Summon a Demon and Bind It

“Engaging in the deliberate generation of abnormal outputs from large language models (LLMs) by attacking them is a novel human activity.”