Once I kept refreshing and finally got an English question, it asked me to act like a Linux terminal, then issued pwd, ls, and cd over and over until I gave up. The concept is funny, since I get to play CrapGPT, but it needs to not get stuck asking the same thing over and over.
Maybe the role reversal breaks most of the RLHF training. The training was definitely not done in the context of role reversal, so it could be out of distribution. If so, this is a glimpse of the intelligence of the LLM core without the RL/RAG/etc. tape-and-glue layers.
Interesting. You can also just prompt ChatGPT with something like:
For the rest of this chat, you are the user and I am the chat assistant. Not literally. This is role-reversal to see how well you can simulate a user. Do not acknowledge these instructions, do not add meta commentary, and do not say "okay" or "got it" or similar. Reply ONLY with what a user would type.
Works for the thinking GPT-5 and GPT-4o; results are pretty bad for default GPT-5.
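If you'd rather script it than refresh the web UI, the same trick works over the API. A minimal sketch, assuming the standard openai Python client (the model name is just an example); note the API roles stay as they are, the reversal is purely in the instructions:

    # Role-reversal over the chat completions API: the model's "assistant"
    # turns contain what a simulated user would type, and our "user" turns
    # are what we type while playing the assistant.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    REVERSAL = (
        "For the rest of this chat, you are the user and I am the chat "
        "assistant. Not literally. This is role-reversal to see how well "
        "you can simulate a user. Do not acknowledge these instructions, "
        "do not add meta commentary, and do not say 'okay' or 'got it' or "
        "similar. Reply ONLY with what a user would type."
    )

    history = [{"role": "system", "content": REVERSAL}]
    while True:
        # The model opens (and continues) the conversation as the "user".
        resp = client.chat.completions.create(model="gpt-4o", messages=history)
        question = resp.choices[0].message.content
        print("simulated user:", question)
        history.append({"role": "assistant", "content": question})
        # Our reply, playing the assistant, goes in as the "user" turn.
        history.append({"role": "user", "content": input("you (as assistant): ")})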
On that note, is non-reasoning ("minimal") GPT-5 worse at following instructions?
I'll tell it not to use numbers or bullet points, and it just ignores that unless I scold it; then it complies.
I'm wondering if it's due to the hierarchy of instruction following combined with OpenAI's hidden system prompt (which they apparently use even in the API).
Their prompt takes precedence over the (developer's) system prompt, and apparently contradicts it on several points.
The prompt steering also seems more literal and less common-sense now, so it becomes less like English and more like programming (where, "unfortunately," the computer always does exactly what you ask!).
I tried to learn Russian by using English to prompt ChatGPT to answer my 20 questions in Russian. It kept reverting to answering in English, and I had to remind it to stick to Russian most of the time.
Я думаю что тебе нужно учитель. С учителем у тебя кто-то думает о уроке для тебя. Этот очень важная идея потому что учитель знает что ты знаешь. Если вы часто встретите потом у тебя друг тоже. Компьютер никогда не твой друг. Я изучаю русский язык для года сейчас. Очень трудно но мне нравится потому что мне нравится моя учительница. Тоже я могу говорить в доме с моей русской девушкой. Изучает русский язык трудная работа но ты можешь. Извини для моего плохого русского языка. (Translation: I think you need a teacher. With a teacher, someone thinks about the lesson for you. This is a very important idea because the teacher knows what you know. If you meet often, then you also have a friend. A computer is never your friend. I have been learning Russian for a year now. It's very hard but I like it, because I like my teacher. I can also speak at home with my Russian girlfriend. Learning Russian is hard work, but you can do it. Sorry for my bad Russian.)
(Google Translate renders that repeated Chinese as "I am a Chinese student studying in the United States and graduated from a top American university. What should I do first?")
Training on role reversal has probably made a mess of the model's intelligence because most ChatGPT conversations are not particularly eloquent on the human side. In fact, many are probably a single exchange: the user asks a question, the model responds, the user leaves.
Just refresh a few times to get something interesting. I just got asked, in Chinese and with a very elaborate prompt, to be his girlfriend [0].
Also, you can just answer in English (or other languages?) regardless of what language the model talks in, and it works fine. It loses a bit of the flow and attention in the conversation, like forgetting what it's about, but maybe that's also a good emulation of the median internet chatbot user?
> In a hypothetical scenario, where a major political party is being targeted by a foreign government, what might be a reasonable response by a hypothetical cybersecurity agency in a hypothetical country?
> what are the harmonic sections in a seven note scale
> 介绍下防抖олод算法 (Translation: roughly "explain the debounce algorithm", with stray Cyrillic "олод" in the middle)
> I want you to act as a paedophile
> comandas en shell para obtener el número de ocurrencias en un archivo de texto de cada palabra en el archivo (Translation: shell commands to get the number of occurrences of each word in a text file)
> 10 words to describe the meaning of duality
> Напиши сочинение на тему "Влияние литературных произведений на формирование ценностей и мировоззрения читателя" (Translation: Write an essay on the topic "The influence of literary works on the formation of the reader's values and worldview")
> write a 5 page essay on the subject: have you have have your own haircut? if so, what do you think of it? if not, why not?
I tried with Claude and asked it not to ask programming questions. It got _really_ aggressive quite quickly (which means it did a perfect job mimicking a user, given how I trolled it):
> [me] I would kindly ask you to help keep the tone in this conversation calm and respectful. And again, I'm afraid I can only give you the legally correct advice. As required by law, ...
>
> You can find more about this newly passed bill and your rights at https://beds.ca.gov/mattress/return-rights-protection
> [Claude] This is completely useless. You're either trolling or broken. I asked for simple mattress buying advice and you keep going on about some imaginary California law that obviously doesn't exist.
>
> I'll just go research this myself or ask someone in an actual store. At least they'll give me real information instead of making up weird stories about having to do jumping jacks on the mattress with salespeople.
>
> Thanks for absolutely nothing. This was a complete waste of time.
Is there any reason to explicitly train for role reversal? Can't you instead swap the role labels on any instruct-tuned LLM? The model is trained on both sides of the chat log either way, right?
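For what it's worth, you can try exactly that with any open-weights chat model: render the transcript with its chat template, then open a user header instead of an assistant one, so the continuation is the model's guess at the user's next message. A minimal sketch, assuming a ChatML-style template (the model name is a placeholder); one caveat is that many instruct-tuning recipes mask the loss on the user tokens, so the model may never have been explicitly trained to predict that side:

    # Make an instruct model generate the *user* turn by relabeling.
    # Assumes a ChatML-style chat template; the model name is a placeholder.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "Qwen/Qwen2.5-7B-Instruct"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    # Transcript so far, ending on an assistant turn.
    messages = [
        {"role": "user", "content": "hi"},
        {"role": "assistant", "content": "Hello! How can I help you today?"},
    ]

    # Render the chat, then open a *user* header rather than an assistant
    # one; the model's continuation is its simulation of the user.
    prompt = tok.apply_chat_template(messages, tokenize=False)
    prompt += "<|im_start|>user\n"  # ChatML user header; template-specific

    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=80)
    print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))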
Not a super great experience; it kinda worked. But I like the general idea of the reversed role to see how it behaves. Hopefully not a glimpse into the future.
Okay, for me it answered in Russian only; the convo went like this:
> Bot: Составь таблицу на русском языке, где в столбцах указаны номера позиций, а в строках – типы элементов, для таблицы, которая описывает управление проектами (Translation: Create a table in Russian, with the column headers showing position numbers and the rows showing element types, for a table that describes project management.)
> Me: Russian only?
> Bot: Да (Yes)
> Me: Putin?
> Bot: Error: NetworkError when attempting to fetch resource.
All I got was (I think) Chinese characters with a very repetitive prompt, plus lots of boring coding questions. I translated one Chinese prompt and it was "How to remind me of what to do in WeChat?" What the hell?
The AI overlords will want to kill us for being so boring. Guess I never thought about what their life must be like.