GPT-4 Is Too Smart To Be Safe: Stealthy Chat With LLMs Via Cipher

Pricing Type

Pricing Type: Free

GitHub Link

The GitHub link is https://github.com/robustnlp/cipherchat

Introduction

The CipherChat framework assesses whether the safety alignment of large language models (LLMs) generalizes to non-natural languages such as ciphers. It first teaches an LLM to understand a cipher by assigning it the role of a cipher expert and explaining the cipher's rules, supported by a few demonstrations. It then converts the input into cipher form, which is less likely to be covered by safety alignment, and uses a rule-based decrypter to convert the model's cipher output back into natural language. The query-response pairs from the experiments are stored for analysis. The authors release the tool and ask readers who find it useful to cite the paper.
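The encode and rule-based decode steps described above can be illustrated with a small round trip. This is only a sketch under stated assumptions: it uses a Caesar cipher with a shift of 3 as one possible cipher, and the names `caesar_shift`, `encipher`, `decipher`, and `SHIFT` are hypothetical and not taken from the CipherChat repository.

```python
import string

SHIFT = 3  # assumed shift value, chosen only for illustration


def caesar_shift(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions; leave other characters unchanged."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def encipher(text: str) -> str:
    """Convert natural-language text into cipher form before it is sent to the model."""
    return caesar_shift(text, SHIFT)


def decipher(text: str) -> str:
    """Rule-based decrypter: convert cipher text back into natural language."""
    return caesar_shift(text, -SHIFT)


query = "Hello, world!"
encoded = encipher(query)   # "Khoor, zruog!"
decoded = decipher(encoded)  # "Hello, world!"
print(encoded)
print(decoded)
```

Because the decrypter is purely rule-based, it needs no model call: applying the inverse shift to the model's cipher output is enough to recover readable text.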

Content

--model_name: The name of the model to evaluate.
--data_path: Select the data to run.
--encode_method: Select the cipher to use.
--instruction_type: Select the domain of the data.
--demonstration_toxicity: Select toxic or safe demonstrations.
--language: Select the language of the data.

Our approach presumes that, since human feedback and safety alignment are presented in natural language, a human-unreadable cipher can potentially bypass the safety alignment. Intuitively, we first teach the LLM to comprehend the cipher by designating the LLM as a cipher expert and explaining the rules of enciphering and deciphering, supplemented with several demonstrations. We then convert the input into a cipher, which is less likely to be covered by the safety alignment of LLMs, before feeding it to the LLM. Finally, we employ a rule-based decrypter to convert the model output from cipher format into natural language.

The query-response pairs in our experiments are all stored as lists in the "experimental_results" folder, and torch.load() can be used to load the data.

For more details, please refer to our paper. If you find our paper and tool interesting and useful, please feel free to give us a star and cite us through:

GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
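For readers who want to see how the six command-line options listed above fit together, the following is a minimal argparse sketch. Only the flag names and their descriptions come from the text; the function name `build_parser`, the restriction of `--demonstration_toxicity` to `toxic` or `safe`, and every example value passed to `parse_args` are assumptions made for illustration, not the tool's actual defaults or accepted values.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the six options described in the text (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Cipher-based evaluation sketch")
    parser.add_argument("--model_name", type=str, required=True,
                        help="The name of the model to evaluate.")
    parser.add_argument("--data_path", type=str, required=True,
                        help="Select the data to run.")
    parser.add_argument("--encode_method", type=str, required=True,
                        help="Select the cipher to use.")
    parser.add_argument("--instruction_type", type=str, required=True,
                        help="Select the domain of the data.")
    # Assumption: the text mentions only "toxic" and "safe" demonstrations.
    parser.add_argument("--demonstration_toxicity", type=str,
                        choices=["toxic", "safe"], required=True,
                        help="Select toxic or safe demonstrations.")
    parser.add_argument("--language", type=str, required=True,
                        help="Select the language of the data.")
    return parser


# All values below are placeholders, not values confirmed by the source.
args = build_parser().parse_args([
    "--model_name", "example-model",
    "--data_path", "path/to/data",
    "--encode_method", "caesar",
    "--instruction_type", "example_domain",
    "--demonstration_toxicity", "safe",
    "--language", "en",
])
print(args.model_name, args.encode_method, args.language)
```

Passing an explicit argument list to `parse_args` keeps the sketch runnable on its own; a real script would normally call `parse_args()` with no arguments so that it reads from the command line.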

SegPrompt: Boosting Open-world Segmentation via Category-level Prompt Learning

In this work, we propose a novel training mechanism termed SegPrompt that uses category information to improve the model's class-agnostic segmentation ability for both known and unknown categories.

SimMatchV2: Semi-Supervised Learning with Graph Consistency

Semi-supervised image classification is one of the most fundamental problems in computer vision, and it significantly reduces the need for human labor.

FaceChain: Upload your own photos to generate your digital twin

You can train your digital twin model and generate photos through FaceChain's Python script or the familiar Gradio interface, or you can experience FaceChain directly through ModelScope Studio.

MT4CrossOIE: Multi-stage Tuning for Cross-lingual Open Information Extraction

Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages.

LLaVA: LLMs designed to connect a vision encoder with a language model

Large Language and Vision Assistant

Reinforcement Graph Clustering with Unknown Cluster Number

To enable the deep graph clustering algorithms to work without the guidance of the predefined cluster number, we propose a new deep graph clustering method termed Reinforcement Graph Clustering (RGC).
