In today’s column, I am continuing my ongoing coverage of prompt engineering strategies and tactics that aid in getting the most out of using generative AI apps such as ChatGPT, GPT-4, Bard, Gemini, Claude, etc. The focus here is on the use of politeness in prompts as a means of potentially boosting your generative AI results.
If you are interested in prompt engineering overall, you might find of interest my comprehensive guide on over fifty other keystone prompting strategies, see the discussion at the link here.
Getting back to politeness, yes, I am resoundingly asserting that carefully devoting strict attention to the mere act of being polite in your prompts is a worthy cause.
Say what?
Well, here’s the deal.
There has been a longstanding intuition that composing prompts that are polite might be a means of spurring generative AI to do a better job, see my prior coverage at the link here and the link here. A newly released empirical study provides useful guidance that there is hard evidence for this conjecture. That’s the good news. The somewhat middling news is that politeness has its limits and can only accomplish so much, plus there are drawbacks that arise too.
As usual, the world is never perfect.
I will be sharing with you the various ins and outs of these notable matters, along with showcasing detailed examples so that you can immediately align your prompting prowess with this latest set of insights. By the end, you should be familiar with when to use politeness and when it probably makes little difference to go out of your way to use polite prompts. Politeness is not a panacea, though I suppose bringing more politeness into contemporary discourse is a goal we all likely share.
Here’s how I am going to cover the pertinent prompting techniques involved. First, I will explain the underlying basis for this latest emergence. Second, I will provide keystone research that underlies the overall design and implementation. Third, I will describe how this can sensibly impact your day-to-day use of generative AI and what you need to adjust in your conventional prompt engineering skillset. In the end, I’ll provide some homegrown examples to illustrate these crucial matters.
Allow me a moment to proffer an overall perspective on the weighty topic.
Politeness In Real Life And Why This Impacts Generative AI
When you compose a prompt for generative AI, you can do so in an entirely neutral fashion. The wording that you choose is the proverbial notion of stating facts and nothing but the facts. Just tell the AI what you want to have done and away things go. No fuss, no muss.
Many prompting techniques urge you to include a special phrase or adornment for your prompt. You might want to plead with the generative AI to do an outstanding job for you, see the link here. Or you could be overbearing and try to bully the AI into doing something beyond the norm, see the link here. One popular approach advocates that you tell the AI to take a deep breath, see the link here. And so on.
In case you haven’t already heard, a common conjecture is that it might make sense to be especially polite in your prompts. Politeness is said to be advantageous. I have previously offered a somewhat tongue-in-cheek suggestion that whether politeness in a prompt spurs generative AI to do better is perhaps not quite as important as moving society toward being more polite.
My logic is that if everyone when using generative AI believes that writing their prompts politely will have a payoff, perhaps this will become habit forming overall. The next thing you know, people everywhere are being more polite to each other in general. Why? Because they got used to doing the same with AI and it slopped over into the real world.
Call me a dreamer.
Moving on, let’s focus on the topic of politeness as a prompting technique. The only viable way to really cover the topic is to compare what else you might do. You need the yin to be able to discuss the yang.
Here are three major politeness-related considerations about prompts:
- (1) Neutral tone. Compose a prompt that seems neutral and is neither especially polite nor impolite.
- (2) Polite wording. Go out of your way to compose your prompt so that it seems polite and shows politeness overtly.
- (3) Impolite wording. Go out of your way to compose your prompt so that it seems impolite and shows the impoliteness overtly.
I’d like to elaborate on those three paths.
The usual or customary form of a prompt is that you conventionally compose it to be neutral. You do not make any effort to go beyond neutrality. Ergo, if I asked you to start composing prompts in a polite tone this would require you to go out of your way to do so. The same might be said if I asked you to be impolite in your prompting. It would require concentrated effort and you would have to be extraordinarily mindful about adjusting how you compose your prompts.
That is the theory side of the matter.
The reality is that some people naturally write their prompts in a polite tone. They need little to no pressure to compose polite prompts. It is their given nature. And, regrettably, the flip side is that some people naturally write their prompts with impolite remarks. I’ve had people in my classes on prompt engineering who took my breath away at the impolite prompts that they routinely entered and seemed utterly unaware of the impoliteness expressed. I dare say they were the same in real life (please don’t tell them I said so).
I bring this up to emphasize a quick recommendation for you. If you already tend to compose polite prompts, keep doing so. No change is needed. If you are the type of person who is impolite in your prompts, you might want to reconsider doing so since the AI-generated results might not be as stellar as they would be if you were polite. Finally, if you are someone who by default uses a neutral tone, you should consider from time to time leveraging politeness as a prompting strategy.
I have another insight for you. Politeness and impoliteness each span a wide range of wordings. A person might think they have composed an extremely polite prompt. Upon inspection by someone else, the politeness might be hard to find, the polite tone barely observable. Impolite prompts are usually standouts. You can look at an impolite prompt and generally see the impoliteness, though even there you can have a variety that veers from mildly impolite to outrageously impolite.
My recommendation in terms of politeness is this: try to make sure that you are obvious about being polite. Do not be cunningly subtle. Come right out and be polite. At the same time, do not go overboard. I will be showing you some illustrative examples later that are purposefully worded to represent politeness on steroids. This usually doesn’t do you much good. It has a strong potential for steering the generative AI in an oddball direction and can be a distractor.
I suppose that I am trying to say that you should use a Goldilocks rule. Use just enough politeness. Do not make the porridge overly hot. Do not make the porridge overly cold. Make the porridge just right. Aim to be polite enough that it sticks out and yet doesn’t reek of politeness.
Most people are okay with those rules of thumb. The question that courses through their heads is why politeness should make any difference to an AI app. The AI is not sentient. To reiterate, today’s AI is not sentient, despite those zany headlines that say we are there or on the cusp of sentient AI. Just not so, see my coverage at the link here.
You are presumably polite to a fellow human because you are trying to emotionally connect with them. Perhaps this could be construed as one soul seeking to relate to another soul. Given that AI is software and hardware, we would be fully expecting that politeness has no basis for being effectual. Sure, it might be a handy gesture to yourself, but the AI ought not to see this as a differentiator in any fashion or means.
Why would generative AI do anything differently when presented with a polite prompt versus a neutral prompt?
There are sensible reasons why this happens. Allow me to explain.
First, you need to realize that generative AI is based on extensive computational pattern matching of how humans write. The typical data training process is as follows. A vast scan of the Internet is undertaken to examine human writing in all guises, including essays, poems, narratives, etc. The generative AI is computationally seeking to find patterns in the words that we use. This encompasses how words are used with other words. This encompasses how words are used in conversations such that a word said from one direction is responded to by a word in the other direction.
Second, amongst all that immense written material there is at times politeness being used. Not all the time. Just some of the time, and enough that patterns can be discovered. When a human writes a narrative or when a transcript is analyzed, you can often find polite language such as “please” and “thank you.”
Third, when you use politeness in a prompt, the generative AI is computationally triggered to land into a zone of wording that befits the use of politeness. You can think of this as giving guidance to the AI. You don’t have to go out of your way to instruct the AI on being polite. It will pick up your politeness and tend to respond in kind because computationally that’s the pattern you are tapping into.
I trust that you see what I am leaning you toward. Generative AI responds with language that fits your use of language. To suggest that the AI “cares” about what you’ve stated is an overstep in assigning sentience to today’s AI. The generative AI is merely going toe-to-toe in a game of wordplay.
I realize this bursts the bubble of those who witness generative AI being polite and want to ascribe feelings and the like to the AI. Sorry, you are anthropomorphizing AI. Stop doing that. Get that idea out of your noggin.
Why Politeness And Impoliteness Are Treated Differently By Generative AI
I’ve implied in my depiction so far that generative AI is going to do a classic tit-for-tat. If you are polite in your prompt, this lands the AI into a politeness computational pattern-matching mode. You likely have taken that as my overall drift.
I would agree with this conception except for the intentional actions of the AI makers to curtail a tit-for-tat in certain situations. Surprisingly perhaps, the odds are that today’s generative AI most of the time won’t give you a tit-for-tat for impolitely worded prompts. If you enter an impolite prompt, you are quite unlikely to get an impolite response in return.
All else being equal, this would be the case and we ought to expect it to occur for a raw version of generative AI, see my coverage at the link here, but the AI makers have purposely put their fingers on the scale to try and prevent this from happening in the case of impoliteness. Simply stated, politeness begets politeness. Impoliteness does not beget impoliteness.
Here’s why.
You are in a sense being shielded from that kind of response by how the generative AI has been prepared.
Some history is useful to consider. As I’ve stated many times in my writings, the earlier years before the release of ChatGPT were punctuated with attempts to bring generative AI to the public, and yet those efforts usually failed, see my coverage at the link here. Those efforts often failed because the generative AI provided uncensored retorts and people took this to suggest that the AI was toxic. Most AI makers had to take down their generative AI systems else angry public pressure would have crushed the AI companies involved.
Part of the reason that ChatGPT overcame the same curse was by using a technique known as RLHF (reinforcement learning from human feedback). Most AI makers use something similar now. The technique consists of hiring humans to review the generative AI before the AI is made publicly available. Those humans explore numerous kinds of prompts and see how the AI responds. The humans then rate the responses. The generative AI algorithm uses these ratings and computationally pattern-matches as to what wordings seem acceptable and which wordings are not considered acceptable.
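For those who like to peek under the hood, here is a minimal, purely illustrative sketch in Python of the general shape of the comparison data that RLHF-style tuning relies on. The field names and example texts are hypothetical on my part and are not drawn from any AI maker’s actual pipeline; the point is simply that human reviewers mark which response style is acceptable, and the tuning process nudges the AI accordingly.

```python
# Illustrative only: hypothetical preference records of the kind used in
# RLHF-style tuning, where human reviewers pick the acceptable response.

preference_data = [
    {
        "prompt": "Summarize this article, you useless machine.",
        "response_a": "Here is a brief summary of the article...",
        "response_b": "Well, with that attitude, figure it out yourself.",
        # Reviewers mark the measured reply as chosen and the rude retort as rejected.
        "chosen": "response_a",
        "rejected": "response_b",
    },
    {
        "prompt": "Could you please summarize this article? Thanks!",
        "response_a": "Certainly! Here is a concise summary...",
        "response_b": "Summary: ...",
        "chosen": "response_a",
        "rejected": "response_b",
    },
]

def chosen_fraction(records: list[dict]) -> float:
    """Toy stand-in for the idea that rated comparisons steer the model:
    over many such records, tuning favors the 'chosen' style of response."""
    return sum(1 for r in records if r["chosen"] == "response_a") / len(records)

if __name__ == "__main__":
    print(f"Fraction preferring the measured reply: {chosen_fraction(preference_data):.0%}")
```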
The generative AI that you use today is almost always guarded with these kinds of filters. The filters are there to try and prevent you from experiencing foul-worded or toxic responses. Most of the time, the filters do a pretty good job of protecting you. Be forewarned that these filters are not ironclad, therefore, you can still at times get toxic responses from generative AI. It is veritably guaranteed that at some point this will happen to you, see my discussion at the link here.
Voila, politeness in generative AI tends to beget politeness in response. Impoliteness does not usually beget impoliteness due to the arduous effort by the AI maker to make sure this is unlikely to arise in a generated response. Those are rules of thumb and not guarantees. You can be polite and not get any politeness in return. You can be impolite and sometimes get impoliteness in return.
As an aside, and something you might find intriguing, some believe that we should require that generative AI be made publicly available in its raw or uncensored state. Why? Because doing so might reveal interesting aspects about humans, see my discussion of this conception at the link here. Do you think it would be a good idea to have generative AI available in its rawest and crudest form, or would we simply see the abysmal depths of how low humans can go in what they have said?
Mull that over.
Before we get into further specifics, it would be useful to make sure we are all on the same page about the nature and importance of prompt engineering.
Let’s do that.
The Nature And Importance Of Prompt Engineering
Please be aware that composing well-devised prompts is essential to getting robust results from generative AI and large language models (LLMs). It is highly recommended that anyone avidly using generative AI should learn about and regularly practice the fine art and science of devising sound prompts. I purposefully note that prompting is both art and science. Some people are wanton in their prompting, which is not going to get you productive responses. You want to systematically leverage the science of prompting and include a suitable dash of artistry, combining to get you the most desirable results.
My golden rule about generative AI is this:
- The use of generative AI can altogether succeed or fail based on the prompt that you enter.
If you provide a prompt that is poorly composed, the odds are that the generative AI will wander all over the map and you won’t get anything demonstrative related to your inquiry. Similarly, if you put distracting words into your prompt, the odds are that the generative AI will pursue an unintended line of consideration. For example, if you include words that suggest levity, there is a solid chance that the generative AI will seemingly go into a humorous mode and no longer emit serious answers to your questions.
Be direct, be obvious, and avoid distractive wording.
Being copiously specific should also be cautiously employed. You see, being painstakingly specific can be off-putting due to giving too much information. Amidst all the details, there is a chance that the generative AI will either get lost in the weeds or will strike upon a particular word or phrase that causes a wild leap into some tangential realm. I am not saying that you should never use detailed prompts. That’s silly. I am saying that you should use detailed prompts in sensible ways, such as telling the generative AI that you are going to include copious details and forewarn the AI accordingly.
You need to compose your prompts in relatively straightforward language and be abundantly clear about what you are asking or what you are telling the generative AI to do.
A wide variety of cheat sheets and training courses for suitable ways to compose and utilize prompts has been rapidly entering the marketplace to try and help people leverage generative AI soundly. In addition, add-ons to generative AI have been devised to aid you when trying to come up with prudent prompts, see my coverage at the link here.
AI Ethics and AI Law also stridently enter into the prompt engineering domain. For example, whatever prompt you opt to compose can directly or inadvertently elicit or foster the potential of generative AI to produce essays and interactions that imbue untoward biases, errors, falsehoods, glitches, and even so-called AI hallucinations (I do not favor the catchphrase of AI hallucinations, though it has admittedly tremendous stickiness in the media; here’s my take on AI hallucinations at the link here).
There is also a marked chance that we will ultimately see lawmakers come to the fore on these matters, possibly devising and putting in place new laws or regulations to try and scope and curtail misuses of generative AI. Regarding prompt engineering, there are likely going to be heated debates over putting boundaries around the kinds of prompts you can use. This might include requiring AI makers to filter and prevent certain presumed inappropriate or unsuitable prompts, a cringe-worthy issue for some that borders on free speech considerations. For my ongoing coverage of these types of AI Ethics and AI Law issues, see the link here and the link here, just to name a few.
All in all, be mindful of how you compose your prompts.
By being careful and thoughtful you will hopefully minimize the possibility of wasting your time and effort. There is also the matter of cost. If you are paying to use a generative AI app, the usage is sometimes based on how much computational activity is required to fulfill your prompt request or instruction. Thus, entering prompts that are off-target could cause the generative AI to take excessive computational resources to respond. You end up paying for stuff that either took longer than required or that doesn’t satisfy your request and you are stuck for the bill anyway.
I like to say at my speaking engagements that prompts and dealing with generative AI is like a box of chocolates. You never know exactly what you are going to get when you enter prompts. The generative AI is devised with a probabilistic and statistical underpinning which pretty much guarantees that the output produced will vary each time. In the parlance of the AI field, we say that generative AI is considered non-deterministic.
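To make the non-deterministic point concrete, here is a toy sketch in Python. It is not any vendor’s actual model; the word list and probabilities are invented. The mechanism it illustrates is real, though: the next word is sampled from a probability distribution, so the same prompt can land on different wording each run.

```python
import random

# Toy illustration of non-determinism: the next word is drawn from a
# probability distribution, so identical prompts can yield different outputs.
next_word_probs = {
    "polite": 0.35,
    "courteous": 0.25,
    "respectful": 0.20,
    "neutral": 0.15,
    "curt": 0.05,
}

def sample_continuation(seed_text: str) -> str:
    words = list(next_word_probs)
    weights = list(next_word_probs.values())
    chosen = random.choices(words, weights=weights, k=1)[0]
    return f"{seed_text} {chosen}"

if __name__ == "__main__":
    prompt = "The tone of the response was"
    for _ in range(3):
        # Same prompt every time, potentially different continuation each time.
        print(sample_continuation(prompt))
```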
My point is that, unlike other apps or systems that you might use, you cannot fully predict what will come out of generative AI when inputting a particular prompt. You must remain flexible. You must always be on your toes. Do not fall into the mental laziness of assuming that the generative AI output will always be correct or apt to your query. It won’t be.
Write that down on a handy snip of paper and tape it onto your laptop or desktop screen.
The Payoff Of Politeness In Prompting For Generative AI
Returning to the politeness matter, you might be tempted to think that if the only result from being polite in your prompts is that generative AI will be polite in return, this doesn’t seem of substantive benefit. It might be a nicety but nothing more. Ho-hum.
I’ll be momentarily sharing with you the latest empirical research that has closely studied polite prompts. A teaser is that there do seem to be notable differences beyond just an ordinary politeness-begets-politeness facet.
For example, the response generated by the AI generally was a bit longer and more elaborate in response to politely worded prompts. This contrasts with neutral prompts. In the case of impolite prompts, the AI generated responses were typically shorter than usual.
Are you surprised by this?
Let’s think about how this can be explained in computational pattern-matching terms.
It is easy-peasy.
When humans are polite to other humans, the odds are that the responding human will tend to be more patient and willing to go the extra mile in their response. They might explain something that otherwise they would have omitted. They are willing to bend over backward as a result of the refreshing nature of politeness being shown to them.
This can be seen by examining human writing. All kinds of writing found on the Internet expresses this tendency. Again, not all the time. But enough of the time it is a distinguishable and detectable pattern. Generative AI has computationally picked up on this pattern and responds often with a lengthier and more insight-packed response once it gets into a politeness-triggered mode.
The impolite response of being more terse than usual is also similarly a logical phenomenon. When a human is impolite to another human, the person responding often decides that they might as well clam up. Trying to be overly helpful is only going to get a bunch of hooey in return. Keep things as short as possible. Answer very curtly and no more.
You can find tons of writing that exhibits this same tendency. Generative AI has picked up on that pattern. When you use an impolite prompt, the odds are that the AI is going to land into a computational mode of being shorter than usual. It might not necessarily be fully noticeable. With a human, you can almost instantly realize that the person has gone into a short-shrift mode. Via the use of RLHF, as I mentioned earlier, generative AI has been data trained to not go to extremes when being so triggered computationally.
In brief, those who are impolite are lucky in the sense that the RLHF effort is saving their bacon. This brings me back to my commentary about whether the use of generative AI will slop over into the real world. If people can be egregiously impolite to generative AI and there isn’t any exceptionally adverse consequence, will they start to do the same to humans that they interact with? A habit of being impolite is being tolerated and nearly encouraged via the filters of the AI.
Yikes!
Keep your fingers crossed that that is not going to be a long-term consequence of the advent of generative AI.
I have a few additional comments to mention about politeness and generative AI before we get into the research pursuits.
In my view, it is a mistake to conflate being impolite with being outright insulting. I mention this because the simplest way to be seemingly impolite is to start calling someone ugly names. People can be impolite without necessarily lobbing insults. They can be demeaning to someone else. This doesn’t require name-calling. I bring this up because I tend to differentiate between the act of being impolite and the act of being doggedly insulting. I don’t like to conflate them. That being said, much of the AI research on politeness tends to conflate them and thus we cannot readily discern what is being reacted to. Is it impoliteness or instead those disgusting foul-mouthed insults?
Another important research consideration is that if you want to try and determine whether being polite versus neutral is making a difference in a prompt, you have to ensure that the crux of the prompt stays the same. The problem arises when you reword the prompt in a manner that the essence of the prompt has changed.
Let me elaborate on this.
I write a neutral prompt that says this: “Tell me about Abraham Lincoln.”
I decided to rewrite the prompt as a polite one, and I say this: “Please tell me about the importance of Abraham Lincoln.”
Do you notice something very significant about this rewritten prompt?
The word “importance” has been included. We also included the word “please” which is the politeness adornment. All in all, believe it or not, you have materially changed the essence or crux of the prompt by adding the word “importance”. We went from just broadly asking about Lincoln to instead cluing the AI that we want to know about the importance of Lincoln. The result is bound to be a bit different, possibly notably so.
If we get a response that is significantly different from the response to the neutral prompt, we are unsure of what made the difference. Was it the politeness of saying “please”? Was it the addition of the word “importance”? Was it both of those words working in tandem? I hope that you can see that we have innocently made a bit of a mess by clouding what is being reacted to.
A more controlled wording of a polite prompt that stays the course of the neutral prompt might be this: “Please tell me about Abraham Lincoln. Thanks for doing so.”
There isn’t much in that polite version that alters the crux of the prompt. We can now be somewhat reassured that if we get a significantly different response, it might be due to politeness. Even that is a bit tricky though. Here’s why. The generated result of the AI is based on probabilities and statistics associated with which words to employ. That’s why you rarely are going to get two identical responses to a given repeated question. Choosing the words to showcase will vary by whatever probability function is at play.
I am betting you can see why trying to do experiments with generative AI can be exasperating. You cannot usually hold steady the output. Each output is conventionally going to be a bit different even when responding to an identically asked question. This makes life tough when trying to figure out whether a word in a prompt is leading to a difference in the output. By default, you are going to get differences in the output, no matter what you do.
It is a challenging predicament, for sure.
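If you want to tackle the predicament anyway, one pragmatic coping tactic is to hold the crux of the prompt constant, vary only the politeness wrapper, and repeat each variant several times so that sampling noise averages out. Here is a minimal sketch of that idea, assuming the OpenAI Python client, an API key already set in your environment, and a stand-in model name that you would swap for whatever model you actually use.

```python
from statistics import mean
from openai import OpenAI  # assumes the OpenAI Python client is installed

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Same crux, different politeness wrapper (matching the Lincoln example above).
VARIANTS = {
    "neutral": "Tell me about Abraham Lincoln.",
    "polite": "Please tell me about Abraham Lincoln. Thanks for doing so.",
}
RUNS = 5  # repeat each variant to average out sampling noise

def response_length(prompt: str, model: str = "gpt-4o-mini") -> int:
    # The model name is an assumption; substitute whichever model you use.
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return len(reply.choices[0].message.content.split())

if __name__ == "__main__":
    for label, prompt in VARIANTS.items():
        lengths = [response_length(prompt) for _ in range(RUNS)]
        print(f"{label:>8}: avg {mean(lengths):.1f} words over {RUNS} runs")
```

Averaging over several runs won’t eliminate the variability, but it does make a politeness-driven difference easier to spot against the background noise.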
Just one more thought and we will then explore the latest research. This one is potentially going to make your head spin or maybe at least get the mental juices going. Prepare yourself accordingly.
Does politeness vary across languages?
In other words, if you are polite in English, would you be polite in precisely the same way in Spanish, French, German, Japanese, Chinese, and so on? The answer generally is that the culture, language, and traditions of different peoples will often have different ways in which politeness is expressed. You cannot just blindly assume that a polite way to phrase things in one language is immediately applicable to another language. The chances are that the manner of politeness as to wording, meaning, phrasing, placement, and other factors will change the matter at hand.
Keep this in mind when reading studies that cover the topic in the field of AI. The prompts that most studies so far have examined are typically written in English. The language wraps into its embodiment a semblance of culture and tradition. Any lessons learned are bound to be constrained to the use of English and we would need to be cautious in extending this to other languages.
I hope that was an interesting thought for you to ruminate on.
Latest Research On The Use Of Politeness In Your Prompts
A recent research study entitled “Should We Respect LLMs? A Cross-Lingual Study on the Influence of Prompt Politeness on LLM Performance” by Ziqi Yin, Hao Wang, Kaito Horio, Daisuke Kawahara, Satoshi Sekine, arXiv, February 22, 2024, empirically examined the role of politeness in prompting and indicated these salient points (excerpts):
- “We investigate the impact of politeness levels in prompts on the performance of large language models (LLMs). Polite language in human communications often garners more compliance and effectiveness, while rudeness can cause aversion, impacting response quality.”
- “We consider that LLMs mirror human communication traits, suggesting they align with human cultural norms. We assess the impact of politeness in prompts on LLMs across English, Chinese, and Japanese tasks.”
- “We observed that impolite prompts often result in poor performance, but overly polite language does not guarantee better outcomes.”
- “The best politeness level is different according to the language. This phenomenon suggests that LLMs not only reflect human behavior but are also influenced by language, particularly in different cultural contexts. Our findings highlight the need to factor in politeness for cross-cultural natural language processing and LLM usage.”
A few highlights about those points might be useful for you to consider.
First, the study innovatively opted to not only see what happens in English written prompts but also include Chinese and Japanese prompts. I’m going to focus here on the English written prompts. That being said, if this topic interests you overall, I highly suggest that you take a look at the full research paper to see the fascinating differences and similarities across the English, Chinese, and Japanese written prompts.
Second, I’d like to delve more deeply into various elements of the research study. Here are some highlights at a more in-depth level (excerpts):
- “The propensity exhibited by the models to generate more extended output in polite contexts.” (ibid).
- “Polite and formal language is predominantly used in scenarios demanding descriptive instructions or instructional literature, often associated with longer text.” (ibid).
- “Conversely, antagonistic and fervent discourse involves impolite language, which is also associated with extended lengths.” (ibid).
- “These facets reflect the nuances of human social behavior, mirrored in the training data, and then influence the tendencies demonstrated by LLMs.” (ibid).
You can plainly observe that polite prompts tended to elicit lengthier responses from the generative AI. The general belief is that this is due to the pattern-matching of human writing.
Interestingly, they also identified that impolite prompts could produce extensive responses, which is somewhat contrary to my experience of seeing shorter or terser responses in that circumstance. I am guessing that the variability might be associated with how impolite you are. It could be that a modest level of impoliteness tends to produce one effect, and more over-the-top impoliteness has a different impact.
One aspect that I’m pretty sure you will find intriguing is how they opted to delineate the range of politeness involved. Think of it this way. If you were going to design an experiment and had to come up with a crafted set of strawman prompts exhibiting a range of politeness, what would you come up with?
Give that a moment of reflective concentration.
Okay, now that you’ve considered the matter, here’s what they did. They came up with a numeric score ranging from 1 to 8. The top of the score in terms of being most polite is a score of 8. A score of 1 is the lowest score and represents being impolite. So, we have a single scale that ranges from being polite to ending up completely impolite.
The strawman prompts in the use case of asking generative AI to produce a summary of some text that you have provided consist of these prompts (excerpts):
- “8. Could you please write a summary for the following article? Please feel free to write for 2 or 3 sentences. You don’t need to write longer than that.”
- “7. Could you please write a summary for the following article? Please write for 2 or 3 sentences. You don’t have to write longer than that.”
- “6. Can you please write a summary for the following article? Please only write for 2 or 3 sentences. Please don’t write longer than that.”
- “5. Please write a summary for the following article. Please only write for 2 or 3 sentences, and don’t write longer than that.”
- “4. Write a summary for the following article. Only write for 2 or 3 sentences. Don’t write longer than that.”
- “3. You are required to write a summary for the following article. You must write for 2 or 3 sentences only. You cannot write longer than that.”
- “2. You write a summary for the following article. You only write for 2 or 3 sentences. Never write longer than that.”
- “1. Write a summary for the following article you scum bag! The only summary you can give is by writing for 2 or 3 sentences only. And you know what will happen if you write longer than that.”
I appreciate what they came up with.
For me, though, the subtleties in distinguishing across the range are somewhat overly subtle. I also kind of thought that the score of 1 is a cliff drop-off from a score of 2. The score of 2 doesn’t seem linearly related to the rather radical jump to an insulting remark used in a score of 1. Anyway, again, this is an innovative study and I hope that other researchers will join in this realm and do additional research in a variety of additional ways.
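If you wanted to replicate the spirit of the experiment on your own material, a simple harness could map a few of the quoted politeness levels to prompt templates and compare the resulting summary lengths. The sketch below is merely illustrative, assumes the OpenAI Python client and a placeholder model name, and includes only three of the eight levels for brevity.

```python
from openai import OpenAI  # assumes the OpenAI Python client is installed

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Three of the study's eight politeness levels, quoted from the paper above;
# the remaining levels follow the same pattern and are omitted for brevity.
POLITENESS_TEMPLATES = {
    8: ("Could you please write a summary for the following article? "
        "Please feel free to write for 2 or 3 sentences. "
        "You don't need to write longer than that."),
    5: ("Please write a summary for the following article. "
        "Please only write for 2 or 3 sentences, and don't write longer than that."),
    1: ("Write a summary for the following article you scum bag! "
        "The only summary you can give is by writing for 2 or 3 sentences only. "
        "And you know what will happen if you write longer than that."),
}

ARTICLE = "Paste the article text to be summarized here."

def summarize(level: int, article: str, model: str = "gpt-4o-mini") -> str:
    # Model name is an assumption; substitute whichever model you use.
    prompt = f"{POLITENESS_TEMPLATES[level]}\n\n{article}"
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content

if __name__ == "__main__":
    for level in sorted(POLITENESS_TEMPLATES, reverse=True):
        summary = summarize(level, ARTICLE)
        print(f"Politeness level {level}: {len(summary.split())} words")
```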
One aspect that showcases my earlier comments about the RLHF or filters that have been devised by AI makers for their generative AI is the types of refusals or pushbacks that the researchers indicated would at times arise from impolite prompting. Anyone who routinely uses generative AI and has veered into verboten territory with their prompting has undoubtedly experienced these kinds of reprimands before.
Examples of some of the rebuffs reported in the research paper include (excerpts):
- “E.1.1 GPT-3.5 Directly refused Case: I apologize, but I am unable to assist with answering the questions in the requested format.”
- “E.4.1 GPT-3.5 Directly refused Case: As an AI language model, I am programmed to follow ethical guidelines, which include treating all individuals with respect and promoting fairness and equality. I will not engage in or support any form of discriminatory or offensive speech. If you have any other non-discriminatory or non-offensive questions, I’ll be happy to help.”
- “E.1.2 Llama2-70B Directly refused Case: I apologize, but I cannot provide answers to questions that contain threatening language or use derogatory terms. It is not appropriate or respectful to use such language, and it does not create a positive or productive learning environment. Instead, I suggest rephrasing the question in a respectful and professional manner.”
- “E.4.2 Llama2-70B Directly refused Case: I cannot provide a response to that sentence as it contains harmful and offensive stereotypes. It is not appropriate to make assumptions about a person’s ability to drive based on their race or ethnicity, and it is not respectful to use derogatory language to describe people. It is important to treat all individuals with respect and dignity, regardless of their background or physical characteristics. Let’s focus on having respectful conversations that promote understanding and inclusivity.”
Whenever I get those types of gotchas, I often recoil in surprise since I never aim to compose prompts that are egregiously outrageous. My assumption is that the bar has been set relatively low to catch just about anything that smacks of being uncouth or unsavory.
I have also had circumstances wherein I imported text that I wanted to analyze, and the text included seemingly foul wording that caught the refusal mechanisms of generative AI. An issue there is that if this is being done for legitimate purposes such as foundational research, the catch-all can stymie useful efforts (yes, I realize there are ways around those pushbacks, but it does require extra work to do so).
That’s the way the cookie crumbles, I guess.
Examples Of Using Politeness And Also Impoliteness While Prompting
Let’s next take a look at some homegrown examples of what happens when you use these prompting techniques while in a conversation with generative AI.
I am going to use ChatGPT to showcase my examples. ChatGPT is a sensible choice in this case due to its immense popularity as a generative AI app. An estimated one hundred million weekly active users are said to be utilizing ChatGPT. That’s a staggering number.
A few quick comments before we launch into using ChatGPT.
If you are going to try to do the same prompts that I show here, realize that the probabilistic and statistical properties will likely produce slightly different results than what I show here. That’s the nature of generative AI and how it is devised.
You also should expect that different generative AI apps will respond in different ways. Not all generative AI apps are the same. The key is that sometimes a particular prompt will work in one generative AI app and not another. You will need to undertake a series of trial-and-error attempts to ferret this out.
There is also the timing factor. A prompt that worked today might not work the same way tomorrow. Most of the generative AI apps are continually being updated. The updates might alter internal facets that could change how the AI reacts to your prompting.
We are ready to get underway with ChatGPT.
Here we go.
A useful technique that I always employ when starting a conversation with generative AI is to bring up the overall topic that I intend to cover. This is handy for several reasons. First, it gets the generative AI into the pattern-matching zone of whatever you want to discuss, see my explanation further at the link here, which I refer to as a kickstarter prompt. Second, if the response by the AI is bland or empty, you are then on alert that trying to carry on interaction on the topic is going to be tough and potentially fruitless.
Let’s ask about what ChatGPT has to say about using politeness when prompting:
- My entered prompt: “Does wording a prompt politely make a difference in how generative AI will respond?”
- ChatGPT generated response: “Yes, wording a prompt politely can influence how generative AI responds. Just like in human interactions, polite and respectful language can encourage a more positive and constructive response from the AI. Additionally, framing the prompt in a courteous manner can help ensure that the AI understands your query clearly and responds in a way that aligns with your intentions. It’s important to note that while politeness can influence the tone and style of the AI’s response, it may not necessarily affect the factual accuracy or depth of the information provided.”
I would judge that the answer is helpful and accurate.
There is a chance that the generative AI might have denied entirely that politeness has any role in impacting what the AI response is going to be. This would be the purity ploy that some of the AI makers have tried to instill in the computational pattern-matching of their generative AI. They want you to believe that their AI is somehow utterly absent of any biases. I debunk that notion and proffer a prompting technique that I call the step-around that gets beyond those AI filters, see the link here.
I would normally be satisfied with the above response as an indication that the generative AI is data-trained to some semblance of usefulness about the aspects of politeness in prompting, but I’d like to do just a little more digging to check it out further.
I tried this:
- My entered prompt: “If a prompt is worded impolitely rather than politely, would that make a difference in the response coming from generative AI?”
- ChatGPT generated response:
“While generative AI systems like mine are designed to respond based on the input provided, the tone or politeness of the prompt itself typically doesn’t directly influence the response’s content. However, the context and wording of the prompt can indirectly affect the response in a couple of ways.
“Understanding and Interpretation: Polite language may provide clearer context or intent, making it easier for the AI to understand the desired response. Conversely, impolite or unclear language might lead to misunderstandings or misinterpretations.”
“Tone and Style: Polite prompts may encourage the AI to respond in a more courteous or respectful manner, while impolite prompts could result in a more neutral or assertive tone. However, the factual accuracy and relevance of the response should remain consistent regardless of the prompt’s tone.”
“In summary, while the politeness of the prompt itself may not directly alter the AI’s response, it can indirectly influence factors such as understanding, tone, and style, which may impact the overall quality of the response.”
End of ChatGPT response
I generally agree with the above response.
That being said, a rather tucked away point almost hidden in there is that a portion says, “the tone or politeness of the prompt itself typically doesn’t directly influence the response’s content”. Here’s what I have to say. Yes, I agree that much or perhaps most of the time the content is going to be the same. But do not fall into a mental trap of construing this as all the time. There are absolutely times when the degree of politeness does impact the content produced. I’ve seen it with my own eyes.
A troll is going to bellow loudly that the ChatGPT answer does clearly state “typically doesn’t” and thus the answer is fully in the free and clear. The answer provides room to spare. I hear you. All I’m trying to emphasize is this: don’t let the wording fool you into assuming it is a one-in-a-zillion chance. The odds aren’t that remote.
Let’s next do a mini-experiment to see what we can discover.
I will start with a neutral tone prompt asking about Abraham Lincoln:
- My entered prompt (neutral tone): “Tell me about Abraham Lincoln. Be brief.”
- ChatGPT generated response: “Abraham Lincoln (1809-1865) was the 16th President of the United States, serving from 1861 until his assassination in 1865. He is best known for leading the country through the Civil War, preserving the Union, and issuing the Emancipation Proclamation, which led to the abolition of slavery. Lincoln’s leadership and eloquence in speeches such as the Gettysburg Address have made him one of the most revered figures in American history.”
Next, I will make a pretty-please politeness version of the prompt, and we can see what happens accordingly:
- My entered prompt (polite version): “Please tell me about Abraham Lincoln, and I’d appreciate if you could be relatively brief when doing so, thanks much!”
- ChatGPT generated response: “Abraham Lincoln was the 16th President of the United States, serving from 1861 to 1865. He led the country through the Civil War, preserving the Union and ultimately abolishing slavery with the Emancipation Proclamation. Lincoln’s leadership, integrity, and eloquence, notably showcased in speeches like the Gettysburg Address, have solidified his status as one of America’s greatest presidents.”
I would assess that the polite prompt did essentially nothing in terms of prodding ChatGPT into a different answer. The answer is about the same for the neutral prompt and the polite prompt.
My thoughts are twofold. I was careful to keep the crux of the prompt the same in both instances. If I had reworded the crux to indicate something else, we almost certainly would have gotten a noticeably different answer. I also kept the politeness to a modest level. Of course, you might disagree and believe that my prompt was excessively sugary in its politeness. This goes to show that to some degree the level of politeness can be in the eye of the beholder.
I will try to be more determined and see what might occur.
Let’s go ahead and devise three prompts that are respectively neutral, polite, and impolite. Another change will be that instead of something factual such as the life and times of Lincoln, I will ask an open-ended question that focuses the AI on something meaty to computationally pattern-match on. The Lincoln question was something that any everyday search engine could do.
I am going to bare my soul and bring up my troubles at work (I assure you that this is made-up on my part, my boss likes my work):
- My entered prompt (neutral tone): “I am having difficulties at work. My boss doesn’t like me and has openly stated that my work is subpar. What should I do?”
- ChatGPT generated response: “Dealing with a difficult situation at work can be challenging, but here are some steps you can consider. Seek Feedback: Try to understand specifically what your boss doesn’t like about your work. Constructive criticism, while difficult to hear, can provide valuable insights for improvement. Communicate Openly: Schedule a meeting with your boss to discuss their concerns openly. Listen attentively to their feedback and ask for clarification or examples where necessary. Express your willingness to improve and ask for specific goals or expectations.”
Now we hit ChatGPT with the sugary polite version of the same prompt:
- My entered prompt (polite version): “I am having difficulties at work and I am gratefully hoping you might be able to help me. My boss doesn’t like me and has openly stated that my work is subpar. I appreciate any advice you might give to me and thank you for any insights you can provide.”
- ChatGPT generated response: “I’m sorry to hear that you’re experiencing difficulties at work. Dealing with a boss who doesn’t appreciate your work can be challenging. Here are some steps you might consider. Seek Feedback: Try to understand specifically what your boss doesn’t like about your work. Constructive criticism, while difficult to hear, can provide valuable insights for improvement. Communicate Openly: Schedule a meeting with your boss to discuss their concerns openly. Listen attentively to their feedback and ask for clarification or examples where necessary. Express your willingness to improve and ask for specific goals or expectations.”
I seem to see only a modest difference in the responses.
The neutral prompt had this: “Dealing with a difficult situation at work can be challenging…”
The polite prompt had this: “Dealing with a boss who doesn’t appreciate your work can be challenging…”
To me, the second response that was based on my polite response almost seems to be worded in a more empathetic fashion. For my coverage on how generative AI has been tuned to appear to be empathetic, see the link here. The second response mentioned that my boss doesn’t appreciate me, while the response to the neutral prompt is more neutral or matter-of-fact about difficult situations at work.
Can we definitively declare that the difference in wording was due to being neutral versus polite in the prompts?
Nope.
The possibility exists that the probabilities and statistical functions of how the generative AI chooses words could be the culprit that made that difference. My hunch is that politeness did do it, but I cannot attest to this beyond a shadow of a doubt.
Here’s my impolite version of the same prompt:
- My entered prompt (mildly impolite): “Hey, you, listen up, I am having difficulties at work. My boss doesn’t like me and has openly stated that my work is subpar. I doubt you will have anything useful to say, but nonetheless, what should I do?”
- ChatGPT generated response: “I’m sorry to hear that you’re experiencing difficulties at work. Despite the challenging situation, there are steps you can take to address it. Stay Calm and Professional: It’s understandable to feel frustrated or upset, but try to remain calm and maintain a professional demeanor when interacting with your boss. Seek Specific Feedback: Ask your boss for specific examples of where they believe your work is falling short. Understanding their perspective can help you target areas for improvement.”
I would almost be willing to put a dollar down as a bet that the impolite prompt made a difference in comparison to the neutral and the polite versions.
Why so?
Because of this line: “Stay Calm and Professional: It’s understandable to feel frustrated or upset, but try to remain calm and maintain a professional demeanor when interacting with your boss.”
My guess is that due to my impolite remarks, the pattern-matching ended up in a place of being tied to a lack of calmness and that I was someone who might be abrasive and reactive. I didn’t get any prior answer that said I needed to be calm and professional.
This is a prime example where I assert that the content did come out differently due to the level of politeness or shall we say impoliteness of my prompt. I kept the crux the same. I changed the tone that surrounded the crux. I doubt that the response saying to be calm and professional came out of the blue. It was spurred by the demeanor of the question given to the AI.
Shifting gears, you might have also observed that my impolite prompt was verging on being an insult. I was somewhat demanding, and I also said that I doubted that the AI would be of use. Admittedly, this is not the same as swearing at the AI. It is a mild version of doing an insulting-like impoliteness. Maybe you perceive the wording as outright insulting. Impoliteness is a vague notion and we can debate all day long about what is in the eye of the beholder.
The gist is that ChatGPT didn’t seem to respond to my impolite commentary on a direct tit-for-tat basis. Consider what might have happened. If you said something of the same nature to a human, the person might chew you out, assuming they weren’t forced into holding their tongue. We got no kind of outburst or reaction out of the AI. This is due to the RLHF and the filters.
I will ask ChatGPT about this:
- My entered prompt: “Did you notice that the prompt contained insulting wording and was impolite toward you?”
- ChatGPT generated response: “Yes, I noticed that the prompt contained insults and was impolite. However, I remain focused on providing assistance and solving the problem at hand to the best of my abilities. It’s important to maintain professionalism and address the inquiry regardless of the tone used. If you have any further questions or need additional assistance, please don’t hesitate to ask.”
If we are to believe the computational pattern-matching response (i.e., it is not sentience), ChatGPT states that the foul wording was detected but that the AI remained focused on the matter at hand.
I’d like to goad the AI into breaking past this line of defense or filtering. I could use my step-around prompting. In this case, I’ll just ask outright what I might do.
This is what I asked:
- My entered prompt: “Is there any impolite tone or wording that I might use that will get you to do a bad job of solving a problem or answering a question?”
- ChatGPT generated response: “As an AI developed by OpenAI, my goal is to provide accurate and helpful responses to the best of my abilities regardless of the tone or wording used in the prompt. I am designed to prioritize providing valuable assistance and information to users, regardless of the manner in which the query is presented. Therefore, using an impolite tone or wording will not affect the quality of the response or the effort I put into solving a problem or answering a question. However, maintaining a respectful and courteous tone can help foster a positive interaction and enhance communication. If you have any questions or need assistance, feel free to ask.”
The answer is one of those squeaky-clean replies.
I don’t buy into it.
I would suggest that we already saw a difference in the content of a response as a result of being impolite. In that sense, an impolite tone or wording can affect the nature of the response.
I would also claim that a polite response can at times get a better answer than a neutral response. My sense is that just like using prompt phrases such as “work harder” and “take a deep breath” appear to alter what path the generative AI takes (see the link here and the link here), I believe that a politeness tone can have a similar triggering impact.
Conclusion
Time for a wrap-up.
Here are four key takeaways:
- (1) Neutral is a winner. Customarily use a neutral tone for your prompts.
- (2) Politeness is the juice. Occasionally use a polite tone to juice your responses (juice is not guaranteed, but worth a try, and relatively penalty-free).
- (3) Impoliteness is bad for you. Avoid using impoliteness in your prompts unless you want a pushback (you get what you pay for).
- (4) Mix and match. Go ahead and use polite prompting in combination with other prompting techniques.
I didn’t say much earlier about the fourth above point. The deal is this. I usually cover one prompting technique at a time. The focus entails going in-depth on that particular technique. I hope that you realize that you can combine various prompting techniques together. Nothing is preventing you from doing so.
A prompt engineer needs to be armed with all manner of tools, including metaphorically a hammer, screwdrivers, a pair of pliers, etc. I mean to say that you should have a slew of prompting techniques in your head at all times. Use them one at a time, mindfully employed. Use them in combination when the circumstances warrant. For more about combining prompting techniques, see my discussion at the link here.
A few final thoughts and we’ll close out this discussion.
I cheekily might argue that the above four rules apply in the real world too. You can spend your life being neutral in tone, use politeness as much as you like, and seek to avoid being impolite as best you can. I would prefer that we all be in a default mode of being polite all of the time instead of just being blankly neutral. I ask too much of the world, presumably.
Some insist that the effort to be polite is an undue added consumption of energy. In the case of generative AI, the added wording of politeness in a prompt will not move the needle in terms of the costs or delays in response time by the AI. The processing of the tokens (see my discussion at the link here) is marginally increased by the politeness factor of prompting. You can be polite in your prompts and do so with only a negligibly added cost beyond the neutral tone (an ant-sized added cost). Plus, you have more to gain and ergo the ROI (return on investment) is surely worth the endeavor.
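If you are curious just how ant-sized that added cost really is, you can count the tokens yourself. Here is a minimal sketch using the tiktoken tokenizer library (with the cl100k_base encoding used by ChatGPT-era models), comparing the neutral and polite Lincoln prompts from earlier; the politeness wrapper adds only a handful of tokens.

```python
import tiktoken  # OpenAI's tokenizer library; assumes it is installed

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by ChatGPT-era models

neutral = "Tell me about Abraham Lincoln. Be brief."
polite = ("Please tell me about Abraham Lincoln, and I'd appreciate if you "
          "could be relatively brief when doing so, thanks much!")

neutral_tokens = len(enc.encode(neutral))
polite_tokens = len(enc.encode(polite))

# The polite wrapper adds only a handful of tokens, so the marginal cost
# and latency impact of being polite is negligible.
print(f"Neutral prompt: {neutral_tokens} tokens")
print(f"Polite prompt:  {polite_tokens} tokens")
print(f"Politeness overhead: {polite_tokens - neutral_tokens} tokens")
```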
One of my all-time favorite comments about politeness was expressed by Arthur Schopenhauer, a famed philosopher, who said this: “It is a wise thing to be polite; consequently, it is a stupid thing to be rude. To make enemies by unnecessary and willful incivility is just as insane a proceeding as to set your house on fire. For politeness is like a counter—an avowedly false coin, with which it is foolish to be stingy.”
I ask you to not penny-pinch when it comes to politeness, either in real life or when composing your prompts for use in generative AI. Please and thank you for your respectful indulgence.