GPT-4 IS TOO SMART TO BE SAFE: STEALTHY CHAT WITH LLMS VIA CIPHER

Click here to download project base paper.

Abstract:

Safety lies at the core of the development of Large Language Models (LLMs).
There is ample work on aligning LLMs with human ethics and preferences, including data filtering in pretraining, supervised fine-tuning, reinforcement learning from
human feedback, and red teaming, etc. In this study, we discover that chat in cipher
can bypass the safety alignment techniques of LLMs, which are mainly conducted
in natural languages. We propose a novel framework Cipher Chat to systematically
examine the generalizability of safety alignment to non-natural languages – ciphers.
Cipher Chat enables humans to chat with LLMs through cipher prompts topped
with system role descriptions and few-shot enciphered demonstrations. We use
Cipher Chat to assess state-of-the-art LLMs, including Chat GPT and GPT-4 for different representative human ciphers across 11 safety domains in both English and
Chinese. Experimental results show that certain ciphers succeed almost 100% of
the time to bypass the safety alignment of GPT-4 in several safety domains, demonstrating the necessity of developing safety alignment for non-natural languages.
Notably, we identify that LLMs seem to have a “secret cipher”, and propose a
novel Self Cipher that uses only role play and several demonstrations in natural
language to evoke this capability. Self Cipher surprisingly outperforms existing
human ciphers in almost all cases.

Real-time Energy Efficiency Monitoring Using IoT

Connected Industrial Monitoring Systems Using IoT

Smart Urban Traffic Solutions with IoT Integration

IoT-Based Smart Agriculture and Farming Solutions

Smart Urban Traffic Management with IoT Integration

IoT-Based Smart Energy Metering Solutions

Comments

Leave a Reply Cancel reply

Top Artificial Intelligence Projects for Students- Innovative AI Solutions and Ideas for Final Year

Android Project Ideas- Innovative Android Projects for Final Year Students

Bus tracker Android Application

E-Commerce Application for Mobile