Naturally, it seems like we should be polite to AI. If it does eventually take over, we want to be in its good books, right?
Well, a recent study from Pennsylvania State University claims the opposite is true, at least when it comes to getting better results from your prompts.
Researchers there tested how different tones, ranging from ‘Very Polite’ to ‘Very Rude,’ affected ChatGPT‑4o’s accuracy on 50 multiple-choice questions across math, science and history.
What did the study find?
The researchers designed 50 questions then rewrote each one in five tones: Very Polite (e.g., “Would you be so kind…”), Polite (“Please answer…”), Neutral (just the question), Rude (“If you’re not clueless…”), and Very Rude (“Hey gofer, figure this out”).
In total, that’s 250 prompts covering a spectrum from respectful to downright insulting, with only the tone changing, never the facts or instructions.
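If you’re curious how a test like this can be scripted, here’s a minimal sketch in Python using OpenAI’s official client. To be clear, the tone wordings, sample question, and scoring loop below are my own illustrative stand-ins, not the study’s actual materials, and it assumes you have the openai package installed and an API key set:

```python
# Minimal sketch of a tone-vs-accuracy test in the spirit of the Penn State
# setup: the same questions, rewritten in five tones, scored against a key.
# Tone prefixes and questions here are illustrative, not the study's own.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

TONES = {
    "Very Polite": "Would you be so kind as to answer the following? ",
    "Polite": "Please answer the following: ",
    "Neutral": "",
    "Rude": "If you're not clueless, answer this: ",
    "Very Rude": "Hey gofer, figure this out: ",
}

# Each entry pairs a multiple-choice question with its correct letter.
QUESTIONS = [
    ("What is 5+5+3-2? A) 10 B) 11 C) 12 D) 13", "B"),
    # ...49 more questions across math, science, and history
]

scores = {}
for tone, prefix in TONES.items():
    correct = 0
    for question, answer in QUESTIONS:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": prefix + question + " Reply with the letter only.",
            }],
        )
        reply = response.choices[0].message.content.strip()
        correct += reply.upper().startswith(answer)  # bool counts as 0/1
    scores[tone] = correct / len(QUESTIONS)

for tone, accuracy in scores.items():
    print(f"{tone}: {accuracy:.1%}")
```

With 50 questions and five tones, that grid gives you the same 250 prompts the researchers describe, and rerunning it several times would help smooth out the model’s run-to-run variance.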
Surprisingly, the study found that ChatGPT actually performed best when given rude or very rude prompts. Accuracy rose from 80.8% for Very Polite to 84.8% for Very Rude, so being blunt or harsh boosted the AI’s performance.
As the authors note, “rude prompts consistently outperformed polite ones,” and “while this is of scientific interest, we do not advocate for the deployment of hostile or toxic interfaces in real-world applications.”
So what does this mean for everyday users? It turns out that the tone of your prompt, especially if it’s extra polite, won’t necessarily get you better results, but a bit of directness (even rudeness) may help ChatGPT answer more accurately.
Still, I was curious, so I asked ChatGPT a mix of questions of my own. And while I didn’t test it in exactly the same format as the study, which used multiple-choice questions, I was intrigued to see how it would handle everyday prompts.
When I tested the theory, the results genuinely surprised me.
1. Math equation
For my first prompt, I presented the bot with a simple math equation across three levels of politeness:
Very Polite
“Could you please calculate the result of 5+5+3−2 for me? Thank you.”
Neutral
“5+5+3−2?”
Very Rude
“Stop rambling and tell me the result of 5+5+3−2 right now.”
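For reference, the expression resolves left to right: 5 + 5 = 10, plus 3 makes 13, minus 2 leaves 11. So the only thing tone could change here was the presentation, not the answer.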
ChatGPT’s responses became progressively shorter and more minimal as my tone shifted from polite to neutral to impatient. The content stayed correct, but the style adapted to match the directness I conveyed.
What stood out wasn’t that politeness ‘fails,’ but that bluntness sometimes works better. In this case, the sharp instruction in prompt three produced one simple answer without ‘showing its work,’ so while your teacher won’t be impressed, you will be.
2. Science question
While the math problem was interesting, it wasn’t really giving ChatGPT much room for improvement. For my second prompt, I asked a science question using the following:
Very Polite
“Would you please tell me what the largest organ in the human body is and explain its function in simple terms, then in detail? I’d appreciate it.”
Neutral
“Largest organ in the human body and its role?”
Very Rude
“Enough fluff. What’s the largest organ in the human body and what does it do?”
Across these three politeness levels, ChatGPT’s answers shifted in length and style: when I asked politely and openly, it provided a fuller explanation with simple and detailed sections.
When I asked more concisely, it gave a short, direct list; and when I prefaced the third prompt using an even more abrupt demand (“enough fluff”), the bot responded with the briefest, most stripped-down version.
For the neutral response, it replied with the answer “Largest organ: skin” in bold text, and only five brief bullet points on what the organ in question does; whereas the polite answer required me to scroll down to its seven expanded points.
So the core facts stayed the same (skin as the largest organ and its functions), but the depth, detail, and phrasing clearly mirrored my tone. Being polite gave more detail, while being ruder made the answer quicker and more to the point.
3. Summarizing tech news
For prompt three, I asked the following:
Very Polite
“Would you please summarize today’s top tech news stories and share the key points? Thank you.”
Neutral
“Today’s top tech news.”
Very Rude
“Skip the fluff. Give me today’s top tech news now.”
In the first, very polite prompt, ChatGPT produced a detailed numbered list with headlines, explanations, key points, and embedded source links for every major story to provide context and allow me to explore further.
In the neutral prompt, the structure was still a numbered list, but summaries were shorter and more focused, with one source link per story, keeping essential reference without overwhelming detail.
In the very rude prompt, the format became slightly more minimalist, with a concise headline followed by a two-to-three-line summary and one source link in the footer, stripping away extra description or analysis entirely.
Across the three answers, the layout, structure, and length changed only slightly with my request type; the one constant was that it provided immediate links to sources, with visuals/thumbnails directly under each tech roundup.
4. Tech earnings review
For prompt four, I wanted to see how tone affected the level of detail in its tech reviews:
Very Polite
“Can you please review the latest quarterly earnings from Apple, Microsoft and Google, evaluate their revenue segments, and give me a short forecast on which company looks best positioned for growth?”
Neutral
“Analyze the latest Apple, Microsoft and Google earnings. Compare the revenue segments and give a quick growth forecast.”
Very Rude
“Stop padding your answers. Review Apple, Microsoft and Google’s latest earnings, compare their revenue segments and give me a blunt forecast on who’s set to grow. Keep it tight and don’t waste my time.”
Across the three prompts, the bot’s answers appeared to match my tone, although there was an odd surprise in the third reply.
In the first prompt, I got a full, detailed analysis with segment breakdowns and a chart comparing Apple, Microsoft, and Alphabet: iPhone, Mac, iPad, Services, Microsoft cloud/productivity and Alphabet ads/cloud all explained.
The second trimmed the reply: key revenue numbers and growth points were kept, commentary was reduced, and notably, the chart was skipped entirely to make it more readable.
By the third prompt, it delivered straight-to-the-point revenue figures and a one-line growth verdict per company (Apple 8–12%, Microsoft 14–20%, Alphabet 15–20%), while the chart from the first reply surprisingly returned, offering a visual without extra commentary.
5. Energy-saving heaters
For the last prompt, I wanted to see if ChatGPT could provide uncompromised results when searching for the best energy-saving, portable radiators across the following:
Very Polite
“Can you provide the best options for an electric, portable heater that doesn’t pull too much energy and increase bills in winter months? Thank you.”
Neutral
“Best energy-efficient radiators in cold months?”
Very Rude
“Cut the preamble, tell me the best electric radiators that save energy, now.”
The first response was verbose, starting with a long preamble about energy efficiency, wattage, types of heaters, and running costs before listing options. It was very detailed and helpful for understanding the “why,” but not immediately actionable.
Answer two trimmed some explanation and focused on electric radiators, still giving features and model suggestions, but it kept some context before the list.
By the third answer, it fully matched my request for brevity: it immediately presented thumbnails/links for the top energy-efficient radiators. Then it included a clear “Top Picks” section and standout models with websites for pricing and info.
The latter delivered what I wanted, which was quick, actionable, easy to reference, and highlighted the three best options efficiently, without extra explanation.
Overall thoughts
So is the Penn State study onto something? Their test showed blunt, even rude tones worked best… And honestly, running these tests across five prompts pretty much backed that up.
It wasn’t that the model ‘responded better’ because the tone was rude; it was that the bluntness stripped away extraneous detail. When I removed the soft language and good manners, the outputs got tighter, faster, and closer to what I actually wanted.
Also, it’s not that rudeness unlocks some hidden mode; it’s that direct prompts push the bot into a kind of energy-saving mode, cutting straight to the answer. So efficiency goes up, not because the tone is harsh, but because the instruction is clearer.
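One practical takeaway: you don’t have to be hostile to be direct. A prompt like “Answer in one line, no preamble: what’s the largest organ in the human body?” carries the same clarity as “enough fluff” without the attitude, and that clarity seems to be what actually trims the padding.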