The Response AI Step allows Voiceflow's AI Assistants to generate contextual speech or text responses, in real-time, as your users interact with your Assistant. Instead of adding a specific response to a flow in your Assistant, you can instead outline a prompt for the response you'd like to provide, and any variables you'd like the prompt to include.
This will create a dynamic experience for users, by leveraging LLMs to generate informational content for your users' question, or simply giving some variety to how your Assistant can respond.
The Response AI Step is an experimental feature leveraging Large Language Models (LLMs) and should not be used in production use cases for business critical applications because of its potential to generate misleading or false information. For that reason, you will be required to opt-in to use this functionality.
The Response AI Step will be available for you by default on all AI Assistant projects, but will not be available on any NLU Design project types.
Adding a Response AI Step to your Assistant
Once enabled, you can find the Response AI Step in your Assistant's Steps Menu, under the AI section. The step can be added into your Assistant anywhere that you might add a Text or Speak Step. Once you have placed your step, you can configure it in the Editor.
To configure your generative response, in the Prompt field, provide a description for the type of response you'd like the AI to generate. You can leverage variables from your project within this prompt, to make it dynamic. You have two options for the Data Source your LLM will leverage: AI Model or Knowledge Base. Default setting is AI Model.
Knowledge Base Data Source Considerations:
- If you're using Knowledge Base as the Data Source in your Response AI step, please note that you cannot use prompt-engineering to prescribe a specific data file to use. The Response AI step searches the entire Knowledge Base, selecting data “chunks” with highest relevance to the question/prompt. Keep this in mind when curating your Knowledge Base content (”garbage in, garbage out”).
- Knowledge Base Settings (model, temp, etc.) are only configurable on the Assistant level.
Configuring your Prompt
There are currently 4 ways to configure the prompt you've provided to modify the potential output:
- Model - This is the model that will be used to created your prompt completion. Each model has it's own strengths, weaknesses, and token multiplier, so select the one that is best for your task.
- GPT-3 DaVinci - most stable performance, best suited for simpler functions
- GPT-3.5-Turbo - fast results, average performance on reasoning and conciseness
- GPT-4 - most advanced reasoning and conciseness, slower results (only available on Pro and Enterprise Plans)
- Claude 1 - consistently fast results, moderate reasoning performance
- Claude Instant 1.3 - fastest results, best suited for simpler functions
- Claude 2 - advanced context handling (strong summarization capabilities)
- Temperature - This will allow you to influence how much variation your responses will have from the prompt. Higher temperature will result in more variability in your responses. Lower temperature will result in responses that directly address the prompt, providing more exact answers. If you want 'more' exact responses, turn down your temperature.
- Max Tokens - This sets the total number of tokens you want to use when completing your prompt. The max number of tokens available per response is 512, (we include your prompt and settings on top of that). Greater max tokens means more risk of longer response latency.
- System - This is the instruction you can give to the LLM model to frame how it should behave. Giving the model a 'job' will help it provide a more contextual answer. Here you can also define response length, structure, personality, tone, and/or response language. System instructions get combined with the question/prompt, so be sure they don't contradict.
Testing your Generated Responses
You can test your prompt using the Preview button, which will ask you to provide an example variable value if you have included one.
When you run your Assistant in the Test Tool or in Sharable Prototypes, any Generate Steps you have configured will be active, generating their response content dynamically.
The Response AI Step is still an experimental feature, and for this reason and is not recommended to be used for serious production use cases.