The Response AI Step allows Voiceflow's AI Assistants to generate contextual speech or text responses in real time as users interact with your Assistant. Instead of adding a specific response to a flow in your Assistant, you can outline a prompt for the response you'd like to provide, along with any variables you'd like the prompt to include.
This creates a dynamic experience for users: the LLM can generate informational content in response to a user's question, or simply add variety to how your Assistant responds.
The Response AI Step is an experimental feature leveraging Large Language Models (LLMs) and should not be used in production for business-critical applications because of its potential to generate misleading or false information. For that reason, you will be required to opt in to use this functionality.
The Response AI Step will be available by default on all AI Assistant projects, but will not be available on any NLU Design project types.
Adding a Response AI Step to your Assistant
Once enabled, you can find the Response AI Step in your Assistant's Steps Menu, under the Talk section. The step can be added anywhere in your Assistant that you might add a Text or Speak Step. Once you have placed your step, you can configure it in the Editor.
To configure your generative response, provide a description in the Prompt field for the type of response you'd like the AI to generate. Be sure to add any variables you want the response to draw on to the prompt.
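Conceptually, a prompt with variables is resolved into plain text before it reaches the model. The sketch below illustrates that substitution; the variable names and `{variable}` placeholder syntax here are illustrative examples, not a reference to Voiceflow's internal implementation:

```python
# Sketch: resolving {variable} placeholders in a prompt template.
# The variable names below are hypothetical examples.
def resolve_prompt(template: str, variables: dict) -> str:
    """Replace each {name} placeholder with its current value."""
    for name, value in variables.items():
        template = template.replace("{" + name + "}", str(value))
    return template

prompt = resolve_prompt(
    "Answer the customer's question about {product_name} "
    "in one friendly sentence: {last_utterance}",
    {"product_name": "Acme Widget",
     "last_utterance": "Does it ship overseas?"},
)
```

The model only ever sees the resolved text, which is why including the right variables in the prompt is what makes the response contextual.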
Configuring your Prompt
There are currently four settings you can configure to modify the output of the prompt you've provided:
- Model - This is the model that will be used to create your prompt completion. Currently, we offer a selection of OpenAI models, but will expand this selection in the future. Each model has its own strengths and weaknesses, so be sure to select the one that is best for your task:
- GPT-3 DaVinci - provides the most stable performance, but is better suited for simpler functions
- GPT-3.5-Turbo - provides the fastest results on your prompt, with average performance on reasoning and conciseness
- GPT-4 - provides the most advanced reasoning and conciseness, but will return results more slowly than the other options
- Temperature - This controls how much variation your completion will have from the prompt. A higher temperature results in more variability in your responses; a lower temperature results in responses that directly address the prompt with exact answers. If you want more exact responses, turn down your temperature.
- Max Tokens - This sets the total number of tokens you want to use when completing your prompt. The maximum available per response is 2048 tokens (your prompt and settings consume tokens on top of that).
- System - This is the instruction you can give to the LLM model to frame how it should behave. Giving the model a 'job' will help it provide a more contextual answer. (This is not available for all Models.)
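These four settings map closely onto the parameters of an OpenAI-style chat completion request. The sketch below assembles such a request payload as a way to see how the settings relate; it is a conceptual illustration under that assumption, not Voiceflow's actual implementation:

```python
# Sketch: how the four Response AI settings map to an OpenAI-style
# chat completion payload. Illustrative only.
def build_completion_request(prompt: str,
                             model: str = "gpt-3.5-turbo",
                             temperature: float = 0.7,
                             max_tokens: int = 256,
                             system: str = "") -> dict:
    """Assemble a chat completion payload.

    Note: max_tokens caps the *response* length only; the prompt
    itself also consumes tokens from the model's context window.
    """
    messages = []
    if system:
        # The System instruction frames how the model should behave,
        # e.g. giving it a 'job'. Not supported by every model.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,  # higher = more varied responses
        "max_tokens": max_tokens,
    }

request = build_completion_request(
    "Summarize our return policy in one sentence.",
    model="gpt-4",
    temperature=0.2,  # low temperature for exact, on-prompt answers
    system="You are a concise support agent for an online store.",
)
```

Seen this way, Temperature and Max Tokens tune the sampling of a single request, while Model and System change which engine answers and the persona it adopts.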
Testing your Generated Responses
You can test your prompt using the Preview button, which will ask you to provide an example variable value if you have included one.
When you run your Assistant in the Test Tool or in Sharable Prototypes, any Response AI Steps you have configured will be active, generating their response content dynamically.
The Response AI Step is still an experimental feature and, for this reason, is not recommended for business-critical production use cases.