Detecting and Fixing Chatbot Problems: OpenAI GPT-3 Tips and Tricks

 

This article was originally published on Botium’s blog on July 30, 2021, prior to Cyara’s acquisition of Botium.

Every chatbot tester dreams of two buttons: one that detects all the problems of a chatbot, and another that fixes them all.

With OpenAI, we were able to add some nice features to Botium Box that go in that direction.

  • We use OpenAI to guess what the next message in a conversation could be.
  • We integrated a new, multilingual paraphraser that takes risks when generating new alternatives.

 

Here are some tips and use cases that we learned while developing those features.

 

Normalize text

OpenAI’s documentation recommends checking the text for spelling mistakes. But we can go further: the text should be as clean and regular as possible. Some examples:

  • Avoid line breaks. For example, a new line inside the “human” part of a chat can confuse OpenAI.
  • Terminate sentences with a period. (OpenAI then treats the text as a complete sentence and, for example, won’t try to continue it.)
  • Avoid unnecessary information. For example, we distinguish between the user typing the text “pizza” and the user pushing the “pizza” button. For us, the second case looks something like “#user button:pizza”. But this information can confuse OpenAI, so it is better to use just “#user pizza”.
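
The normalization rules above can be sketched in a few lines of Python. This is only an illustration, not the code used in Botium Box: the function names are made up for this example, the “button:” prefix and the “#user” line format are taken from the sample above, and spell checking is left out.

```python
import re


def normalize_utterance(text: str) -> str:
    """Normalize one chat message before it is placed into a prompt."""
    # Drop the "button:" marker (format assumed from the example above),
    # so "button:pizza" is treated like the typed text "pizza".
    text = re.sub(r"^\s*button:\s*", "", text)
    # Collapse line breaks and repeated whitespace into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Terminate the sentence so the model is less likely to continue it.
    if text and text[-1] not in ".!?":
        text += "."
    return text


def to_prompt_line(speaker: str, text: str) -> str:
    """Format one conversation turn as a single prompt line (hypothetical helper)."""
    return f"#{speaker} {normalize_utterance(text)}"


print(to_prompt_line("user", "button:pizza"))
print(to_prompt_line("user", "I would like\nto order a pizza"))
```

Note that the sketch applies the period rule to button texts as well; whether that is desirable for very short inputs like “pizza” is a judgment call.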

 

Use the line break at the end of the prompt correctly

This (no line break after the last line):

Human: Hello, who are you?
AI: I am an AI created by OpenAI. How can I help you today?
Human: Can I ask you

and this (one extra line break after the last line):

Human: Hello, who are you?
AI: I am an AI created by OpenAI. How can I help you today?
Human: Can I ask you

 

...is a big difference for OpenAI, even though it is just one extra line break. In the first case, OpenAI continues the human message; in the second, it generates a new AI message.
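
Because the difference is invisible on the page, it helps to make it explicit in code. The following is a minimal sketch; `build_chat_prompt` and its `continue_last` flag are names invented for this example.

```python
def build_chat_prompt(turns, continue_last=False):
    """Join (speaker, text) turns into a single prompt string.

    continue_last=True:  no trailing line break, so the model tends to
                         keep writing the last message.
    continue_last=False: trailing line break, so the model tends to
                         start a new message instead.
    """
    body = "\n".join(f"{speaker}: {text}" for speaker, text in turns)
    return body if continue_last else body + "\n"


turns = [
    ("Human", "Hello, who are you?"),
    ("AI", "I am an AI created by OpenAI. How can I help you today?"),
    ("Human", "Can I ask you"),
]

open_prompt = build_chat_prompt(turns, continue_last=True)
closed_prompt = build_chat_prompt(turns)

# repr() makes the trailing "\n" visible.
print(repr(open_prompt[-15:]))
print(repr(closed_prompt[-15:]))
```

Printing the prompt with `repr()` before sending it is a cheap way to catch an accidental trailing line break.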

 

Multilingual translator

Building a multilingual translator is not difficult. The OpenAI Playground includes a sample translator. We can use the same parameters and data structure, but with multilingual samples:

English: See you later!
French: À tout à l'heure!

German: Ich möchte Geld überweisen.
English: I want to transfer money.

Russian: Я хотел бы заказать пиццу.
German: Ich möchte Pizza bestellen.

 

That works well, but we have to specify the language of the source text explicitly. We will overcome this restriction in the next step.

 

Multilingual translator without specifying the source language

To do this, we have to restructure the prompt a bit:

Text: See you later!
LanguageOfResult: French
Result: À tout à l'heure!

Text: Ich möchte Geld überweisen.
LanguageOfResult: English
Result: I want to transfer money.

Text: Я хотел бы заказать пиццу.
LanguageOfResult: German
Result: Ich möchte Pizza bestellen.
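
A prompt in this shape can be assembled programmatically. The sketch below only builds the string; `build_translation_prompt` and `EXAMPLES` are names chosen for this illustration, and sending the prompt to the API (for instance with a line break as the stop sequence) is left out.

```python
# Training samples taken from the prompt above: (text, target language, result).
EXAMPLES = [
    ("See you later!", "French", "À tout à l'heure!"),
    ("Ich möchte Geld überweisen.", "English", "I want to transfer money."),
    ("Я хотел бы заказать пиццу.", "German", "Ich möchte Pizza bestellen."),
]


def build_translation_prompt(text, target_language, examples=EXAMPLES):
    """Build a few-shot prompt that does not name the source language."""
    blocks = [
        f"Text: {src}\nLanguageOfResult: {lang}\nResult: {result}"
        for src, lang, result in examples
    ]
    # The request block stops right after "Result:" with no line break,
    # so the completion is expected to fill in the translation.
    blocks.append(f"Text: {text}\nLanguageOfResult: {target_language}\nResult:")
    return "\n\n".join(blocks)


print(build_translation_prompt("Where is my card?", "Spanish"))
```

The training section and the request use exactly the same field names and order, so the model sees the request as just one more, unfinished sample.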

 

But let's play with it a little bit more.

 

Multilingual translator and language detector

If we add a new field, then we can ask OpenAI for language detection:

Text: See you later!
LanguageOfResult: French
Result: À tout à l'heure!
LanguageOfText: English

Text: Ich möchte Geld überweisen.
LanguageOfResult: English
Result: I want to transfer money.
LanguageOfText: German

Text: Я хотел бы заказать пиццу.
LanguageOfResult: German
Result: Ich möchte Pizza bestellen.
LanguageOfText: Russian

 

It is a nice experiment. Sadly, it does not always recognize the language correctly, but it reveals some facts:

  • It is possible to teach OpenAI parameters (Text, LanguageOfResult) and return values (Result, LanguageOfText). Of course, they are parameters and return values only for us, not for OpenAI.
  • OpenAI can solve more complex tasks in one go. What is a two-step solution for us may be a single step for OpenAI.
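
To treat Result and LanguageOfText as return values, the completion has to be split back into fields. The following is a rough sketch under the assumption that the prompt ended with “Result:” and that the completion has the same shape as the samples above; `parse_completion` is a hypothetical helper, and real completions may need more defensive handling.

```python
def parse_completion(completion: str) -> dict:
    """Extract Result and LanguageOfText from a completion.

    Assumes the prompt ended with "Result:", so the first line of the
    completion is the translation and a later line may carry the
    detected language.
    """
    fields = {"Result": None, "LanguageOfText": None}
    lines = completion.strip().splitlines()
    if lines:
        fields["Result"] = lines[0].strip()
    for line in lines[1:]:
        # A blank line means the model has started a new sample block.
        if not line.strip():
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "LanguageOfText":
            fields["LanguageOfText"] = value.strip()
            break
    return fields


print(parse_completion(" I want to transfer money.\nLanguageOfText: German"))
```

A missing field comes back as `None`, which makes it easy to count how often the language detection part is skipped or malformed.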

 

Paraphraser

Of course, there is more than one way to write a prompt for a paraphraser. There is no best solution; each has its own pros and cons. Our paraphraser is very simple, but it does what we need.

Pros:

  • It is open-ended and finds really nice new alternatives.
  • The prompt is just two lines long, which keeps the cost low.
  • It is not tied to any language. (Some prompts, like our translator, have a training section and a request section. If we hard-code a training section for paraphrasing in English and then get a request to paraphrase German sentences, we end up with a prompt in multiple languages, which can mislead OpenAI.)

Cons:

  • About half of the results are not good paraphrases. But that does not mean they are worthless! If we get “Can I replace my card if it is lost or stolen?” for “Someone took my card!” and “What is the procedure to report a stolen card?”, we can see that it does not fit. But in our case, a banking chatbot, it can point to a new use case.